remotes::install_github("braverock/factorAnalytics", build_vignettes = TRUE, force = TRUE)
Downloading GitHub repo braverock/factorAnalytics@HEAD
These packages have more recent versions available.
It is recommended to update all of them.
Which would you like to update?
1: All
2: CRAN packages only
3: None
4: R6 (2.4.1 -> 2.5.0 ) [CRAN]
5: testthat (2.3.2 -> 3.0.0 ) [CRAN]
6: digest (0.6.26 -> 0.6.27) [CRAN]
7: xfun (0.18 -> 0.19 ) [CRAN]
8: quantreg (5.74 -> 5.75 ) [CRAN]
✓ checking for file ‘/tmp/RtmpNUhB5u/remotes2a1c605fdfccd7/braverock-FactorAnalytics-e378067/DESCRIPTION’ (366ms)
─ preparing ‘FactorAnalytics’:
checking DESCRIPTION meta-information ...
✓ checking DESCRIPTION meta-information
─ checking for LF line-endings in source and make files and shell scripts
─ checking for empty or unneeded directories
─ looking to see if a ‘data/datalist’ file should be added
NB: this package now depends on R (>= 3.5.0)
WARNING: Added dependency on R >= 3.5.0 because serialized objects in serialize/load version 3 cannot be read in older versions of R. File(s) containing such objects: ‘FactorAnalytics/data/brkAnnual.rda’, ‘FactorAnalytics/data/brkMonthlyBloomberg.rda’, ‘FactorAnalytics/data/scoresSPGMI.rda’, ‘FactorAnalytics/data/scoresSPGMIraw.rda’, ‘FactorAnalytics/data/spxAnnual.rda’, ‘FactorAnalytics/data/stocksCRSP.rda’, ‘FactorAnalytics/data/stocksCRSPscoresSPGMI.rda’, ‘FactorAnalytics/data/stocksCRSPscoresSPGMIraw.rda’ (the warning is repeated once per file)
─ building ‘FactorAnalytics_2.0.40.tar.gz’
Installing package into ‘/home/smacanovic/R/x86_64-pc-linux-gnu-library/4.0’
(as ‘lib’ is unspecified)
* installing *source* package ‘FactorAnalytics’ ...
** using staged installation
** R
** data
** inst
** byte-compile and prepare package for lazy loading
rlm is already registered in the fit.models registry
covfm is already registered in the fit.models registry
Warning: replacing previous import ‘data.table::last’ by ‘xts::last’ when loading ‘FactorAnalytics’
Warning: replacing previous import ‘data.table::first’ by ‘xts::first’ when loading ‘FactorAnalytics’
Warning: replacing previous import ‘data.table::melt’ by ‘reshape2::melt’ when loading ‘FactorAnalytics’
Note: possible error in 'standardizeExposures(obj = rollingObject, ': unused argument (obj = rollingObject)
Note: possible error in 'extractRegressionStats(obj = rollingObject, ': unused argument (obj = rollingObject)
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
rlm is already registered in the fit.models registry
covfm is already registered in the fit.models registry
Warning: replacing previous import ‘data.table::last’ by ‘xts::last’ when loading ‘FactorAnalytics’
Warning: replacing previous import ‘data.table::first’ by ‘xts::first’ when loading ‘FactorAnalytics’
Warning: replacing previous import ‘data.table::melt’ by ‘reshape2::melt’ when loading ‘FactorAnalytics’
** testing if installed package can be loaded from final location
rlm is already registered in the fit.models registry
covfm is already registered in the fit.models registry
Warning: replacing previous import ‘data.table::last’ by ‘xts::last’ when loading ‘FactorAnalytics’
Warning: replacing previous import ‘data.table::first’ by ‘xts::first’ when loading ‘FactorAnalytics’
Warning: replacing previous import ‘data.table::melt’ by ‘reshape2::melt’ when loading ‘FactorAnalytics’
** testing if installed package keeps a record of temporary installation path
* DONE (FactorAnalytics)
pacman::p_load(tidyverse,tidyquant,FFdownload,FactorAnalytics,PerformanceAnalytics)
pacman::p_load(tidyverse,tidyquant,FFdownload,PortfolioAnalytics,nloptr)
pacman::p_load(tidyverse,tidyquant,PortfolioAnalytics,nloptr,tsibble,matrixcalc,Matrix,timetk,xts)
## Important Function
# fitTsfm
#
# fitTsfm(asset.names, factor.names, data, fit.method, variable.selection, ...):
#
# Fits a time series (a.k.a. macroeconomic) factor model for one or more asset returns or excess
# returns using time series regression. Least squares (LS), discounted least squares (DLS) and
# robust regression fitting are possible. Variable selection methods include "stepwise", "subsets" and "lars". An object of class "tsfm" containing the
# fitted objects, estimated coefficients, R-squared and residual volatility is returned.
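Per asset and factor, the fitTsfm description above boils down to an ordinary least-squares time-series regression. A minimal sketch of that estimation in plain Python with made-up excess returns — an illustration of the model being fit, not the FactorAnalytics implementation:

```python
# Single-factor time-series regression: r_excess = alpha + beta * f + eps,
# fit by least squares; also returns R-squared and residual volatility,
# the quantities a "tsfm" object reports. Data below are invented.

def ts_factor_fit(r, f):
    """Return (alpha, beta, r_squared, residual_vol) for one asset."""
    n = len(r)
    mean_r = sum(r) / n
    mean_f = sum(f) / n
    cov_rf = sum((ri - mean_r) * (fi - mean_f) for ri, fi in zip(r, f)) / (n - 1)
    var_f = sum((fi - mean_f) ** 2 for fi in f) / (n - 1)
    beta = cov_rf / var_f                      # OLS slope = cov / var
    alpha = mean_r - beta * mean_f             # OLS intercept
    resid = [ri - alpha - beta * fi for ri, fi in zip(r, f)]
    ss_res = sum(e ** 2 for e in resid)
    ss_tot = sum((ri - mean_r) ** 2 for ri in r)
    r2 = 1 - ss_res / ss_tot
    resid_vol = (ss_res / (n - 2)) ** 0.5      # residual std. error
    return alpha, beta, r2, resid_vol

market = [1.2, -0.8, 2.5, 0.3, -1.9, 1.1]   # factor (market) excess returns, %
asset = [1.5, -1.0, 3.1, 0.2, -2.4, 1.6]    # asset excess returns, %
alpha, beta, r2, vol = ts_factor_fit(asset, market)
print(round(beta, 2), round(r2, 2))
```

The robust and DLS variants mentioned above change the loss function, not the regression equation itself.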
Please remember to put your assignment solutions in Rmd format, using many chunks with readable text in between, similar to my examples given in Research Methods and Assignment 1!
For all exercises: please use the Assignment Forum to post your questions; I will try my best to help you along! If you follow the vignettes from FactorAnalytics, wherever it says z.score=T, please exchange it for either z.score='crossSection' or z.score='timeSeries', depending on the task at hand.
In this exercise we want to estimate the CAPM. Please read carefully through the two documents provided (right hand side: files). Then we start to collect the necessary data:
LS&P100I (S&P 100): total return index (RI) and market cap (MV), plus the market factor and risk-free rate (via FFdownload). From both datasets we select data for the last (available) 60 months, calculate returns (simple percentage) for the US stocks and eliminate those stocks that have NAs for this period. Then regress each stock's excess return on the market excess return (mutate and map in combination with lm). Estimate the mean return for each stock and plot the return/beta-combinations. Create the security market line and include it in the plot! What do you find?
pacman::p_load(tidyverse,tidyquant,FFdownload,PortfolioAnalytics,nloptr,readxl,quantmod,timetk,dplyr,xts)
As we have seen: the CAPM for small portfolios does not work very well, and so we start using portfolios that get rid of the idiosyncratic risk! Go to Kenneth French’s Homepage again and download the following datasets: “Portfolios Formed on Market Beta” (where we will use 10 monthly value weighted portfolios formed on beta) and “25 Portfolios Formed on Size and Market Beta” (same thing) as well as the market factor and rf (as before). Now we are going to check the CAPM like famous researchers have done it! We can use returns as they are in the files (simple returns)!
inputlist<-c("F-F_Research_Data_Faktors_CSV.zip","Portfolios_Formed_on_BETA_CSV.zip")
#Now process only these files if they can be matched (download only)
FFdownload(output_file = "FFdata.RData", inputlist = inputlist, exclude_daily=TRUE)
Step 1: getting list of all the csv-zip-files!
Step 2: Downloading 2 zip-files
trying URL 'http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/F-F_Research_Data_Factors_CSV.zip'
trying URL 'http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/Portfolios_Formed_on_BETA_CSV.zip'
Step 3: Start processing 2 csv-files
|===========================================================================================================| 100%
load("FFdata.RData")
portf_mkt_betatest<-(FFdownload$x_Portfolios_Formed_on_BETA$monthly$value_weighted_returns)
portf_mkt_betatest
Lo.20 Qnt.2 Qnt.3 Qnt.4 Hi.20 Lo.10 Dec.2 Dec.3 Dec.4 Dec.5 Dec.6 Dec.7 Dec.8 Dec.9 Hi.10
Jul 1963 1.13 -0.08 -0.97 -0.94 -1.41 1.35 0.77 0.08 -0.24 -0.69 -1.20 -0.49 -1.39 -1.94 -0.77
Aug 1963 3.66 4.77 6.46 6.23 9.69 3.52 3.89 4.29 5.25 5.23 7.55 7.57 4.91 9.04 10.47
Sep 1963 -2.78 -0.76 -0.78 -0.81 -2.73 -3.09 -2.24 -0.54 -0.97 -1.37 -0.27 -0.63 -1.00 -1.92 -3.68
Oct 1963 0.74 3.56 2.03 5.70 3.06 1.25 -0.12 2.00 5.12 2.32 1.78 6.63 4.78 3.10 3.01
Nov 1963 -0.63 -0.26 -0.81 -0.92 0.13 -0.91 -0.15 1.60 -2.05 -0.94 -0.69 -1.32 -0.51 -0.20 0.52
Dec 1963 2.66 2.07 2.66 2.34 0.75 3.86 0.63 2.31 1.83 3.00 2.36 1.25 3.45 0.30 1.28
[ remaining monthly rows omitted -- the original print ran until getOption("max.print") was reached ]
#Download the Portfolios from Kenneth French's Homepage
portf_mkt_beta <- "https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/Portfolios_Formed_on_BETA_CSV.zip"
portf_mkt_beta_csv <- "Portfolios_Formed_on_BETA.csv"
temp <- tempfile()
download.file(portf_mkt_beta, temp, quiet = TRUE)
portf_mkt_beta <- read_csv(unz(temp, portf_mkt_beta_csv), skip = 15, quote = "\",") %>%
dplyr::rename(date = "X1") %>%
mutate_at(vars(-date), as.numeric) %>%
mutate(date = rollback(ymd(parse_date_time(date, "%Y%m") + months(1))))%>%
filter(date >= first('1964-01-01') & date <= '2019-12-31')
Missing column names filled in: 'X1' [1]
── Column specification ─────────────────────────────────────────────────────────────────────────────────────────────
cols(
X1 = col_character(),
`Lo 20` = col_character(),
`Qnt 2` = col_character(),
`Qnt 3` = col_character(),
`Qnt 4` = col_character(),
`Hi 20` = col_character(),
`Lo 10` = col_character(),
`Dec 2` = col_character(),
`Dec 3` = col_character(),
`Dec 4` = col_character(),
`Dec 5` = col_character(),
`Dec 6` = col_character(),
`Dec 7` = col_character(),
`Dec 8` = col_character(),
`Dec 9` = col_character(),
`Hi 10` = col_character()
)
7 parsing failures.
row col expected actual file
688 -- 16 columns 1 columns <connection>
1377 -- 16 columns 1 columns <connection>
1435 -- 16 columns 1 columns <connection>
1493 -- 16 columns 1 columns <connection>
2182 -- 16 columns 1 columns <connection>
.... ... .......... ......... ............
See problems(...) for more details.
Problem with `mutate()` input `Lo 20`:
ℹ NAs introduced by coercion
(the same "NAs introduced by coercion" warning repeats for each of the remaining 15 return columns, `Qnt 2` through `Hi 10`: the non-numeric annual-returns block at the bottom of the CSV cannot be coerced by as.numeric)
177 failed to parse.
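The date wrangling above (parse_date_time on the yyyymm stamp, add one month, then rollback) stamps each observation with the last calendar day of its month. A plain-Python sketch of the same logic — not the lubridate code itself:

```python
# Turn a yyyymm stamp such as "196307" into that month's last calendar day,
# mirroring rollback(ymd(parse_date_time(date, "%Y%m") + months(1))).
import calendar
from datetime import date

def month_end(yyyymm: str) -> date:
    year, month = int(yyyymm[:4]), int(yyyymm[4:])
    last_day = calendar.monthrange(year, month)[1]  # handles leap years
    return date(year, month, last_day)

print(month_end("196307"))  # 1963-07-31
print(month_end("200002"))  # 2000-02-29 (leap year)
```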
#Download the market factor and rf (Fama/French 3 Research Factors)
mkt_factors <- "https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/F-F_Research_Data_Factors_CSV.zip"
mkt_factors_csv <- "F-F_Research_Data_Factors.CSV"
temp <- tempfile()
download.file(mkt_factors, temp, quiet = TRUE)
mkt_factors <- read_csv(unz(temp, mkt_factors_csv), skip = 3, quote = "\",") %>%
dplyr::rename(date = X1) %>%
mutate_at(vars(-date), as.numeric) %>%
mutate(date = rollback(ymd(parse_date_time(date, "%Y%m") + months(1)))) %>%
filter(date >= first('1964-01-01') & date <= '2019-12-31')
Missing column names filled in: 'X1' [1]
── Column specification ─────────────────────────────────────────────────────────────────────────────────────────────
cols(
X1 = col_double(),
`Mkt-RF` = col_double(),
SMB = col_double(),
HML = col_double(),
RF = col_double()
)
8 parsing failures.
row col expected actual file
1132 X1 a double Annual Factors: January-December <connection>
1132 NA 5 columns 1 columns <connection>
1133 Mkt-RF a double Mkt-RF <connection>
1133 SMB a double SMB <connection>
1133 HML a double HML <connection>
.... ...... ......... ................................ ............
See problems(...) for more details.
93 failed to parse.
Subtract the risk-free rate from the first set of 10 portfolios (sorted only on beta: Lo 10, …, Hi 10) and estimate each portfolio's beta with the market.
#join data
ten_portf <- portf_mkt_beta[1:672, -c(2:6)]
ten_portf_joined <- left_join(mkt_factors, ten_portf)
Joining, by = "date"
mkt_factors
ten_portf
ten_portf_joined
#subtract the risk-free rate
ten_portf_rf <- mutate(ten_portf_joined, Lo10rf = Lo10 - RF, Dec2rf = Dec2 - RF, Dec3rf = Dec3 - RF, Dec4rf = Dec4 - RF, Dec5rf = Dec5 -RF, Dec6rf = Dec6 - RF, Dec7rf = Dec7 - RF, Dec8rf = Dec8 - RF, De9rf = Dec9 - RF, Hi10rf = Hi10 - RF)
ten_portf_rf <- ten_portf_rf[-2:-15]
view(ten_portf_rf)
ten_portf_rf
#convert returns and factors to xts for the regressions below (the date column becomes the index)
ten_portf_rf_xts <- tk_xts(ten_portf_rf)
mkt_factors_xts <- tk_xts(mkt_factors)
Non-numeric columns being dropped: date
#Calculate Betas for each portfolio
betas_ten_portf_lm <- lm(ten_portf_rf_xts ~ mkt_factors_xts[, 1])
betas_ten_portf_lm
Call:
lm(formula = ten_portf_rf_xts ~ mkt_factors_xts[, 1])
Coefficients:
Lo10rf Dec2rf Dec3rf Dec4rf Dec5rf Dec6rf Dec7rf Dec8rf De9rf
(Intercept) 0.23458 0.13725 0.13813 0.15888 0.01110 0.05621 -0.09574 -0.01230 -0.11817
mkt_factors_xts[, 1] 0.61008 0.73518 0.83572 0.97113 1.02007 1.08413 1.15670 1.27953 1.38876
Hi10rf
(Intercept) -0.24513
mkt_factors_xts[, 1] 1.61253
betas_ten_portf <- CAPM.beta(Ra = ten_portf_rf_xts, Rb = mkt_factors_xts[, 1], Rf = 0)
betas_ten_portf
Lo10rf Dec2rf Dec3rf Dec4rf Dec5rf Dec6rf Dec7rf Dec8rf De9rf Hi10rf
Beta: Mkt-RF 0.6100834 0.7351849 0.8357215 0.971132 1.020072 1.084128 1.156701 1.279531 1.38876 1.612531
Estimate the mean-return for each stock and plot the return/beta-combinations.
#Estimate Mean Return
mean_ten_portf_rf_xts <- as.data.frame(lapply(ten_portf_rf_xts, FUN=mean))
mean_ten_portf_rf_xts
#Plot the return/beta-combinations
plot.default(x = betas_ten_portf, xlim=c(0, 2),
y = mean_ten_portf_rf_xts, ylim=c(0, 1),
xlab = "Beta", ylab = "Mean Return",
main = "Return/Beta-combinations")
Create the security market line and include it in the plot! What do you find?
mean_mkt <- as.data.frame(lapply(mkt_factors_xts[, 1], FUN=mean))
y_mkt <- mean_mkt[1, 1]
plot.default(x = betas_ten_portf, xlim=c(0, 2),
y = mean_ten_portf_rf_xts, ylim=c(0, 1),
xlab = "Beta", ylab = "Mean Return",
main = "Return/Beta-combinations")
abline(0, y_mkt) #SML: intercept 0, slope = mean market excess return
plot.default(x = betas_ten_portf, xlim=c(0, 2),
y = mean_ten_portf_rf_xts, ylim=c(0, 10),
xlab = "Beta", ylab = "Mean Return",
main = "Return/Beta-combinations")
abline(0, y_mkt)
#summary
summary_CAPM_ten_portf <- (table.CAPM(Ra = ten_portf_rf_xts, Rb = mkt_factors_xts[, 1], Rf = 0)[1:9, ])
(You can split the file in 2-3 different time blocks and see if something changes.) *Now we are done with the first-pass regression.*
#look for first 10 years
ten_portf_rf_10yrs_xts <- ten_portf_rf[1:120, ] %>%
tk_xts(date_var = date, silent = TRUE)
betas_ten_portf_rf_10yrs <- CAPM.beta(Ra = ten_portf_rf_10yrs_xts, Rb = mkt_factors_xts[1:120, 1], Rf = 0)
mean_ten_portf_rf_10yrs_xts <- as.data.frame(lapply(ten_portf_rf_10yrs_xts, FUN=mean))
mean_mkt_10yrs <- as.data.frame(lapply(mkt_factors_xts[1:120, 1], FUN=mean))
y_mkt_10yrs <- mean_mkt_10yrs[1, 1]
plot.default(x = betas_ten_portf_rf_10yrs, xlim=c(0, 2),
y = mean_ten_portf_rf_10yrs_xts, ylim=c(0, 1),
xlab = "Beta", ylab = "Mean Return",
main = "Return/Beta-combinations 1964-1974")
abline(0, y_mkt_10yrs)
summary_CAPM_ten_portf_10yrs <- (table.CAPM(Ra = ten_portf_rf_xts[1:120, ], Rb = mkt_factors_xts[1:120, 1], Rf = 0)[1:9, ])
summary_CAPM_ten_portf_10yrs
betas_ten_portf
Lo10rf Dec2rf Dec3rf Dec4rf Dec5rf Dec6rf Dec7rf Dec8rf De9rf Hi10rf
Beta: Mkt-RF 0.6100834 0.7351849 0.8357215 0.971132 1.020072 1.084128 1.156701 1.279531 1.38876 1.612531
#There are a number of reasons we might expect the CAPM to fail:
#1. Imperfect measures of the market portfolio
#2. Beta is an incomplete measure of risk
#3. Tax effects
#4. Non-normality of returns
#5. No riskless asset
#6. Divergent borrowing and lending rates
#Look at a) -> We now do it with the mean return of every portfolio combined...
#1964-2019
com_mean_ten_portf_rf <- sum(mean_ten_portf_rf_xts)/10
mean_betas_ten_portf <- sum(betas_ten_portf)/10
plot.default(x = mean_betas_ten_portf, xlim=c(0, 2),
y = com_mean_ten_portf_rf, ylim=c(0, 2),
xlab = "Beta", ylab = "Mean Return",
main = "Return/Beta-combinations 10 Portfolios 1964-2019")
abline(0, y_mkt)
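The SML plots above lead into the classic second-pass test: regress the portfolios' mean excess returns cross-sectionally on their first-pass betas, and compare the fitted slope with the mean market excess return. A plain-Python sketch — the betas, mean returns and the 0.50 market premium below are illustrative assumptions, not the values computed above:

```python
# Second-pass (cross-sectional) regression: mean_return_i = g0 + g1 * beta_i.
# Under the CAPM, g0 should be ~0 and g1 should equal the mean market excess
# return; empirically the fitted SML is usually flatter. Numbers are made up.

def cross_sectional_fit(betas, mean_returns):
    n = len(betas)
    mb = sum(betas) / n
    mr = sum(mean_returns) / n
    slope = sum((b - mb) * (r - mr) for b, r in zip(betas, mean_returns)) \
        / sum((b - mb) ** 2 for b in betas)
    intercept = mr - slope * mb
    return intercept, slope

betas = [0.61, 0.84, 1.02, 1.28, 1.61]      # first-pass betas (illustrative)
mean_ret = [0.55, 0.58, 0.62, 0.66, 0.70]   # mean excess returns, % per month
g0, g1 = cross_sectional_fit(betas, mean_ret)
mkt_premium = 0.50                          # assumed mean market excess return
print(g1 < mkt_premium)                     # fitted SML flatter than predicted
```

A positive intercept together with a slope below the market premium is exactly the "too flat" SML that the failure reasons listed below try to explain.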
inputlist1<-c("F-F_Research_Data_Faktors_CSV.zip","25_Portfolios_Formed_on_Size_and Market_Beta_CSV.zip")
#Now process only these files if they can be matched (download only)
FFdownload(output_file = "FFdata.RData", inputlist = inputlist1, exclude_daily=TRUE)
Step 1: getting list of all the csv-zip-files!
Step 2: Downloading 2 zip-files
trying URL 'http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/F-F_Research_Data_Factors_CSV.zip'
trying URL 'http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/ftp/Portfolios_Formed_on_ME_Wout_Div_CSV.zip'
Step 3: Start processing 2 csv-files
|===========================================================================================================| 100%
load("FFdata.RData")
twentyfive_portf<-(FFdownload$x_25_Portfolios_Formed_on_Size_and_Market_Beta$monthly$value_weighted_returns)
twentyfive_portf
NULL
#NULL: the inputlist entry did not exactly match the 25-portfolio zip name, so FFdownload processed a different file (see the URL above)
download.file(portf_mkt_beta, temp, quiet = TRUE)
Error in download.file(portf_mkt_beta, temp, quiet = TRUE) :
invalid 'url' argument
#portf_mkt_beta now holds the parsed tibble from above, not a URL, hence the error
twentyfive_portf
NULL
#join data
twentyfive_portf <- portf_mkt_beta[1:672, -c(7:16)]
twentyfive_portf_joined <- left_join(mkt_factors, twentyfive_portf)
Joining, by = "date"
#subtract the risk-free rate
twentyfive_portf_rf <- mutate(twentyfive_portf_joined, Lo20rf = Lo20 - RF, Qnt2rf = Qnt2 - RF, Qnt3rf = Qnt3 - RF, Qnt4rf = Qnt4 - RF, Hi20rf = Hi20 - RF)
twentyfive_portf_rf <- twentyfive_portf_rf[-2:-10]
The purpose of the further assignments is less programming-related (you can copy most of the code), but to receive a positive grade I want you to dig into the referenced literature and be able to explain everything that you do in detail in the text and your presentation (what you do, what the result is, and how you interpret it). After doing this, given the data, you should be perfectly able to estimate and interpret any type of factor model.
library(timetk)
SP500 <- tq_index("SP500")
Getting holdings for SP500
NASDAQ <- tq_exchange("NASDAQ")
Getting data...
NYSE <- tq_exchange("NYSE")
Getting data...
stocks.selection <- SP500 %>%
inner_join(rbind(NYSE,NASDAQ) %>% select(symbol,last.sale.price,market.cap,ipo.year),by=c("symbol")) %>%
filter(ipo.year<2000&!is.na(market.cap)) %>%
arrange(desc(weight)) %>%
slice(1:10)
stocks.selection
These are the returns of the selected stocks.
stocks.returns <- stocks.selection$symbol %>%
tq_get(get = "stock.prices",
from = "2000-01-01",
to = "2019-12-31") %>%
group_by(symbol) %>%
tq_transmute(select = adjusted,
mutate_fun = periodReturn,
period = "monthly")
stocks.returns
These are the stock returns in xts format, in a wide layout (one column per symbol).
stocks.returns.xts <- stocks.returns%>%
subset( select = c(symbol,date, monthly.returns)) %>%
pivot_wider(names_from = symbol,
values_from = monthly.returns) %>%
tk_xts(date_var = date, silent = TRUE)
colnames(stocks.returns.xts)
[1] "AAPL" "MSFT" "AMZN" "NVDA" "ADBE" "CSCO" "QCOM" "AMGN" "UPS" "ORCL"
Principal Component Analysis (PCA) uses the eigen decomposition of the covariance matrix of asset returns to find the first K principal components that explain the largest portion of the sample covariance of returns. Factor loadings are then estimated using time series regression. Factor analysis instead involves maximum likelihood optimization to estimate the factor loadings and the residual covariance matrix, constructing the factor realizations and choosing a rotation of the coordinate system for a more meaningful interpretation of the factors.
PCA is used when T > N (T: number of observations, N: number of assets); if N > T, Asymptotic Principal Component Analysis (APCA) is used instead.
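The eigen-decomposition step can be illustrated on a toy 2×2 return covariance matrix: for 2×2 the eigenvalues come straight from the quadratic formula, and the proportion of variance each component explains is its eigenvalue over the trace. A plain-Python sketch with made-up covariances (the helper name is ours, and this is not fitSfm itself):

```python
# PCA on a symmetric 2x2 covariance matrix [[a, b], [b, c]]: the eigenvalues
# solve l^2 - (a + c) l + (a c - b^2) = 0, and the share of total variance
# explained by component k is l_k / (l_1 + l_2).

def eig2x2_sym(a, b, c):
    trace, det = a + c, a * c - b * b
    disc = (trace * trace - 4 * det) ** 0.5
    return (trace + disc) / 2, (trace - disc) / 2  # l1 >= l2

# made-up monthly return covariance matrix for two assets (%^2)
a, b, c = 4.0, 1.5, 2.0
l1, l2 = eig2x2_sym(a, b, c)
explained = l1 / (l1 + l2)   # share of variance on the first component
print(round(explained, 3))
```

With N assets the same idea applies to the N×N covariance matrix; the scree plot below is just these eigenvalue shares in decreasing order.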
Fitting a statistical factor model with two principal components (k=2)
fit.pca <- fitSfm(stocks.returns.xts, k=2)
fit.pca
Call:
fitSfm(data = stocks.returns.xts, k = 2)
Model dimensions:
Factors Assets Periods
2 10 240
Factor Loadings:
F.1 F.2
Min. :0.07591 Min. :-0.55419
1st Qu.:0.20299 1st Qu.:-0.19298
Median :0.26045 Median :-0.12929
Mean :0.27738 Mean :-0.09803
3rd Qu.:0.33313 3rd Qu.:-0.07555
Max. :0.63274 Max. : 0.69775
R-squared values:
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.1456 0.3751 0.4340 0.4702 0.5257 0.9705
Residual Volatilities:
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.03091 0.06662 0.06980 0.06779 0.07759 0.08733
Screeplot of eigenvalues. An eigenvector of a linear transformation is a nonzero vector that changes only by a scalar factor when that linear transformation is applied to it. Eigenvalues and eigenvectors allow us to “reduce” a linear operation into separate, simpler problems.
plot(fit.pca, which=1, eig.max = 0.9)
The first principal component explains about 48% of the total variance; the first two components together explain about 61%.
Now plotting the estimated factor returns
plot(fit.pca, which=2)
Estimated factor loadings for all assets. A factor loading is the correlation coefficient between the variable and the factor.
plot(fit.pca, which=3, a.sub=1:10)
First factor has all positive loadings, whereas the second factor has both positive and negative loadings.
t(fit.pca$mimic)
AAPL MSFT AMZN NVDA ADBE CSCO QCOM AMGN UPS ORCL
F.1 0.1151760 0.07146734 0.1385313 0.2281127 0.12173965 0.0985711 0.07831731 0.02736584 0.03149781 0.08922093
F.2 0.0280622 0.14630514 0.5653233 -0.7117659 0.06546406 0.2048660 0.29957387 0.17283115 0.11186325 0.11747694
plot(fit.pca, which=12, n.top=3)
This figure displays the top three assets with the largest and smallest weights in each factor mimicking portfolio. For the first factor, NVIDIA, Amazon and Adobe have the highest weights and Amgen, UPS and Microsoft have the lowest weights. Since all weights are positive, this might be construed as a market-wide factor. For the second factor, Amazon, Qualcomm and Cisco have the highest weights and NVIDIA, Apple and Adobe have the lowest weights.
Now plotting the correlations between assets with the top 3 largest and smallest positions in the F.1 factor mimicking portfolio
plot(fit.pca, which=13, f.sub=1, n.top=3)
Here we can see the correlations between the assets with the top 3 largest and smallest weights in the factor mimicking portfolio for the first principal component. The correlations differ considerably.
plot(fit.pca, which=13, f.sub=2, n.top=3)
Here we can see the correlations between the assets with the top 3 largest and smallest weights in the factor mimicking portfolio for the second principal component. Pretty high correlations overall.
All estimated coefficients from PCA, including the intercept
coef(fit.pca)
(Intercept) F.1 F.2
AAPL 0.0094582331 0.31947608 -0.02750960
MSFT -0.0005719413 0.19823670 -0.14342409
AMZN 0.0026867443 0.38425922 -0.55419090
NVDA -0.0008413294 0.63274098 0.69774977
ADBE 0.0014031930 0.33768242 -0.06417494
CSCO -0.0084564481 0.27341732 -0.20083179
QCOM -0.0038768497 0.21723718 -0.29367465
AMGN 0.0053513795 0.07590759 -0.16942775
UPS 0.0016476700 0.08736888 -0.10966043
ORCL -0.0051641218 0.24748175 -0.11516358
Compare the returns data with the fitted and residual values for AAPL from fit.pca:
AAPL.ts <- merge(fit.pca$data[,1], fitted(fit.pca)[,1], residuals(fit.pca)[,1])
colnames(AAPL.ts) <- c("AAPL.return", "AAPL.fitted", "AAPL.residual")
tail(AAPL.ts)
AAPL.return AAPL.fitted AAPL.residual
2019-07-31 0.07639455 0.02642587 0.0499686870
2019-08-30 -0.01646123 -0.01632520 -0.0001360339
2019-09-30 0.07296155 0.02555255 0.0474089986
2019-10-31 0.11068450 0.05780629 0.0528782135
2019-11-29 0.07755414 0.05600977 0.0215443674
2019-12-30 0.09081378 0.04953200 0.0412817802
fitted(): returns an xts data object of the component of asset returns explained by the factor model
residuals(): returns an xts data object with the component of asset returns not explained by the factor model
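The two accessors satisfy an exact identity: data = fitted + residuals. A base-R sketch with a toy lm() fit (all names and numbers here are illustrative):

```r
# Returns decompose exactly into explained plus unexplained parts.
set.seed(1)
f <- rnorm(100)                                # a single simulated factor
r <- 0.01 + 0.5 * f + rnorm(100, sd = 0.02)    # simulated asset returns
fit <- lm(r ~ f)
max(abs(r - (fitted(fit) + residuals(fit))))   # numerically zero
```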
Summary for fit.pca with HAC standard errors (allows for heteroskedasticity and autocorrelation consistent estimates and standard errors)
sum.pca <- summary(fit.pca, se.type="HAC", n.top=3)
sum.pca$sum.list[[1]]
Call:
lm(formula = AAPL ~ f)
Residuals:
AAPL
Min -0.5191012
1Q -0.0364578
Median -0.0001118
3Q 0.0472573
Max 0.2924632
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.009458 0.006135 1.542 0.124
object$factorsF.1 0.319476 0.032295 9.892 <2e-16 ***
object$factorsF.2 -0.027510 0.058010 -0.474 0.636
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.08733 on 237 degrees of freedom
Multiple R-squared: 0.4426, Adjusted R-squared: 0.4379
F-statistic: 94.08 on 2 and 237 DF, p-value: < 2.2e-16
factor mimicking portfolio weights
sum.pca$mimic.sum
$F.1
$F.2
NA
Omega <- fmCov(fit.pca)
# return correlation plot for all assets
plot(fit.pca, which=8, a.sub=1:10)
decomp <- fmSdDecomp(fit.pca)
#get the factor model standard deviation for all assets
decomp$Sd.fm
AAPL MSFT AMZN NVDA ADBE CSCO QCOM AMGN UPS ORCL
0.11674807 0.08388945 0.13624535 0.17908437 0.11433920 0.09770189 0.10508420 0.07378590 0.05756518 0.09477957
#get the component contribution to Sd
head(decomp$cSd)
F.1 F.2 Residuals
AAPL 0.05132537 0.0001013369 0.06532137
MSFT 0.02750210 0.0038334162 0.05255393
AMZN 0.06362562 0.0352408355 0.03737889
NVDA 0.13125002 0.0425001866 0.00533416
ADBE 0.05855000 0.0005630992 0.05522610
CSCO 0.04492149 0.0064537340 0.04632666
#plotting
plot(fit.pca, which=9, f.sub=1:2, a.sub=1:10)
decomp1 <- fmVaRDecomp(fit.pca)
#factor model Value-at-Risk
head(decomp1$VaR.fm)
AAPL MSFT AMZN NVDA ADBE CSCO
-0.1511336 -0.1230763 -0.1915657 -0.2249127 -0.1601538 -0.1525801
#Marginal factor contributions to VAR
head(decomp1$mVaR)
F.1 F.2 Residuals
AAPL -0.2685626 0.019684759 -0.7419490
MSFT -0.2139303 0.044635093 -1.1184894
AMZN -0.2210924 0.097203372 -0.7390325
NVDA -0.2914210 -0.051684810 -0.1441595
ADBE -0.2905442 0.003866864 -0.7776361
CSCO -0.2554710 0.074042153 -1.0086633
# plotting
plot(fit.pca, which=11, f.sub=1:2, a.sub=1:10)
#### Expected Shortfall decomposition
decomp2 <- fmEsDecomp(fit.pca)
# factor model Expected Shortfall
head(decomp2$ES.fm)
AAPL MSFT AMZN NVDA ADBE CSCO
-0.2577662 -0.1714347 -0.2923700 -0.3138664 -0.2269914 -0.2307265
# percentage component contribution to ES
head(decomp2$pcES)
F.1 F.2 Residuals
AAPL 42.00512 0.3084239 57.686452
MSFT 34.03340 7.9302336 58.036365
AMZN 47.78847 27.1436084 25.067918
NVDA 81.55481 16.8396382 1.605556
ADBE 51.46756 0.8127492 47.719693
CSCO 31.64666 14.3307631 54.022580
# plotting
plot(fit.pca, which = 10, f.sub=1:2, a.sub=1:10)
Follow the file tsfmVignette.pdf and interpret your results.
In a time series or macroeconomic factor model, observable economic time series such as industrial production growth rate, interest rates, market returns and inflation are used as common factors that contribute to asset returns. - For example, the famous single index model by Sharpe (1964) uses the market excess return as the common factor (captures economy-wide or market risk) for all assets and the unexplained returns in the error term represents the non-market firm specific risk. - On the other hand, Chen et al. (1986) uses a multi-factor model to find that surprise inflation, the spread between long and short-term interest rates and between high and low grade bonds are significantly priced, while the market portfolio, aggregate consumption risk and oil price risk are not priced separately.
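Sharpe's single index model can be sketched with a plain lm() fit on simulated excess returns; the true alpha (0.002) and beta (0.6) below are assumptions for illustration, not estimates from any dataset in this document:

```r
# Minimal single index model: asset excess return on market excess return.
set.seed(7)
rm.excess <- rnorm(132, mean = 0.005, sd = 0.04)              # market excess returns
ri.excess <- 0.002 + 0.6 * rm.excess + rnorm(132, sd = 0.02)  # one asset's excess returns
si.fit <- lm(ri.excess ~ rm.excess)
coef(si.fit)                  # intercept = alpha, slope = market beta
summary(si.fit)$r.squared     # share of variation explained by the market factor
```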
library(FactorAnalytics)
# The following examples primarily use the managers dataset from the PerformanceAnalytics package.
# It’s an "xts" data object with:
# - 132 observations of monthly returns
# - on 10 variables:
# - six hypothetical asset managers,
# - 1x EDHEC returns (Long-Short Equity hedge fund index)
# - 1x sp500 returns
# - US treasury bonds 10 years (will serve as explanatory factors)
# - US treasury bills 3 months (can be considered as the risk free rate)
# - there are some "not available" observations (note the different start dates!)
data(managers)
Warning in tclass.xts(x) : index does not have a ‘tclass’ attribute
Warning in tclass.xts(x) : index does not have a ‘tclass’ attribute
Warning in tclass.xts(object) :
index does not have a ‘tclass’ attribute
Warning in tzone.xts(object) : index does not have a ‘tzone’ attribute
# We want to see the managers names
colnames(managers)
[1] "HAM1" "HAM2" "HAM3" "HAM4" "HAM5" "HAM6" "EDHEC.LS.EQ" "SP500.TR"
[9] "US.10Y.TR" "US.3m.TR"
# and we want to see from when to when the data is available
first(managers)
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6 EDHEC.LS.EQ SP500.TR US.10Y.TR US.3m.TR
1996-01-31 0.0074 NA 0.0349 0.0222 NA NA NA 0.034 0.0038 0.00456
last(managers)
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6 EDHEC.LS.EQ SP500.TR US.10Y.TR US.3m.TR
2006-12-31 0.0115 -0.0062 0.011 0.0206 0.0317 0.0215 0.0153 0.01403 -0.0155 0.00441
# the Ham1-Ham6 are the asset returns we want to explain --> y in our model
asset.names <- colnames(managers[,1:6])
# the edhec, sp500 & US Treasury they are the explanatory ones --> x in our model
factor.names <- colnames(managers[,7:9])
# Typically, factor models are fit using excess returns. If the asset and factor returns are not in excess return form, "rf.name" can be specified to convert returns into excess returns.
rf.name <- "US.3m.TR"
# Similarly, market returns can be specified via "mkt.name" to add market-timing factors to the factor model.
mkt.name <- "SP500.TR"
The default model fitting method is LS regression and the default variable selection method is “none” (that is, all factors are included in the model). The different model fitting options are: 1. least squares (LS), 2. discounted least squares (DLS) and 3. robust regression fitting (Robust)
And the variable selection options are: 1. “stepwise”, 2. “subsets” and 3. “lars”
The default for rf.name and mkt.name are NULL. If rf.name is not specified by the user, perhaps because the data is already in excess return form, then no risk-free rate adjustment is made. Similarly, if mkt.name is not specified, market-timing factors are not added to the model. All other optional control parameters passed through the ellipsis are processed and assimilated internally by fitTsfm.control.
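The rf.name adjustment amounts to subtracting the risk-free series from the asset and factor returns before fitting. A sketch with illustrative numbers (not the managers data):

```r
# Converting raw returns to excess returns, as done internally when
# rf.name is specified.
rf   <- c(0.00456, 0.00440, 0.00441)   # risk-free (3-month T-bill) returns
ham1 <- c(0.0074,  0.0120, -0.0050)    # raw asset returns
ham1.excess <- ham1 - rf               # excess returns used in the regression
ham1.excess
```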
# The series have unequal histories in this sample and “fitTsfm“ removes asset-wise incomplete cases (asset’s return data combined with respective factors’ return data) before fitting a factor model.
args(fitTsfm)
function (asset.names, factor.names, mkt.name = NULL, rf.name = NULL,
data = data, fit.method = c("LS", "DLS", "Robust"), variable.selection = c("none",
"stepwise", "subsets", "lars"), control = fitTsfm.control(...),
...)
NULL
# Single Index Model using SP500
fit.singleIndex <- fitTsfm(asset.names=asset.names,
factor.names="SP500.TR", #specfic factor!
rf.name="US.3m.TR",
data=managers)
# fitted object from the time-series LS regression of asset returns on estimated factors.
fit.singleIndex$asset.fit
$HAM1
Call:
lm(formula = HAM1 ~ ., data = reg.xts, model = TRUE, x = FALSE,
y = FALSE, qr = TRUE)
Coefficients:
(Intercept) SP500.TR
0.005775 0.390071
$HAM2
Call:
lm(formula = HAM2 ~ ., data = reg.xts, model = TRUE, x = FALSE,
y = FALSE, qr = TRUE)
Coefficients:
(Intercept) SP500.TR
0.009093 0.338394
$HAM3
Call:
lm(formula = HAM3 ~ ., data = reg.xts, model = TRUE, x = FALSE,
y = FALSE, qr = TRUE)
Coefficients:
(Intercept) SP500.TR
0.006216 0.552323
$HAM4
Call:
lm(formula = HAM4 ~ ., data = reg.xts, model = TRUE, x = FALSE,
y = FALSE, qr = TRUE)
Coefficients:
(Intercept) SP500.TR
0.00403 0.69141
$HAM5
Call:
lm(formula = HAM5 ~ ., data = reg.xts, model = TRUE, x = FALSE,
y = FALSE, qr = TRUE)
Coefficients:
(Intercept) SP500.TR
0.001733 0.320833
$HAM6
Call:
lm(formula = HAM6 ~ ., data = reg.xts, model = TRUE, x = FALSE,
y = FALSE, qr = TRUE)
Coefficients:
(Intercept) SP500.TR
0.007837 0.323541
# specifics values
fit.singleIndex$alpha
fit.singleIndex$beta
# Interpretation:
# if the market return rises 1%, then the return of HAM1 rises 0.39%
# R-squared is a statistical measure of how close the data are to the fitted regression line. It is also known as the coefficient of determination (or the coefficient of multiple determination for multiple regression). The definition of R-squared is fairly straightforward: it is the percentage of the response variable variation that is explained by a linear model.
# R-squared is always between 0 and 100%:
# 0% indicates that the model explains none of the variability of the response data around its mean.
# 100% indicates that the model explains all the variability of the response data around its mean
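The definition above can be checked by hand: R-squared is 1 - SS(residual)/SS(total), which matches summary(lm)$r.squared. Toy data, all values illustrative:

```r
# R-squared computed manually versus the value reported by summary().
set.seed(3)
x <- rnorm(50)
y <- 1 + 2 * x + rnorm(50)
fit <- lm(y ~ x)
r2.manual <- 1 - sum(residuals(fit)^2) / sum((y - mean(y))^2)
all.equal(r2.manual, summary(fit)$r.squared)   # TRUE
```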
fit.singleIndex$r2
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
0.43386770 0.16731517 0.43409179 0.31480051 0.08286005 0.26006315
# Interpretation:
# R-squared: 1 would be 100% - linear function matches perfectly with the data --> here we have low R-squared
# The residuals show how far the data fall from the regression line and assess how well the line describes the data.
fit.singleIndex$resid.sd
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
0.01934497 0.03343043 0.02737911 0.04428621 0.04413790 0.02061739
# Interpretation:
# here we have low residuals
class(fit.singleIndex)
[1] "tsfm"
# time series factor model
names(fit.singleIndex)
[1] "asset.fit" "alpha" "beta" "r2" "resid.sd"
[6] "call" "data" "asset.names" "factor.names" "mkt.name"
[11] "fit.method" "variable.selection"
Overview of the single factor linear fits for the assets.
fit.singleIndex
Call:
fitTsfm(asset.names = asset.names, factor.names = "SP500.TR",
rf.name = "US.3m.TR", data = managers)
Model dimensions:
Factors Assets Periods
1 6 132
Regression Alphas:
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
(Intercept) 0.005775 0.009093 0.006216 0.00403 0.001733 0.007837
Factor Betas:
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
SP500.TR 0.3901 0.3384 0.5523 0.6914 0.3208 0.3235
R-squared values:
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
0.43387 0.16732 0.43409 0.31480 0.08286 0.26006
Residual Volatilities:
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
0.01934 0.03343 0.02738 0.04429 0.04414 0.02062
# Interpretation:
# How good does the single index model fits to the data?
# HAM1 fits the linear model best --> highest R-squared
# HAM5 does not really fit this model --> see its alpha and R-squared values
plot(fit.singleIndex, which=12, f.sub=1)
Market timing accounts for the price movement of the general stock market relative to fixed income securities. This includes the down-market factor --> max(0, Rf-Rm). To test market timing ability, this factor can be added to the single index model as shown below. The coefficient of this down-market factor can be interpreted as the number of "free" put options on the market provided by the manager's market-timing skills. That is, a negative value for the regression estimate would imply negative market timing ability for the manager.
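The down-market factor itself is simple to construct: it is nonzero only in months where the market underperforms the risk-free rate. A sketch with made-up series:

```r
# Henriksson-Merton down-market factor: max(0, Rf - Rm).
rm.ret <- c(0.034, -0.020, 0.015, -0.050)   # market returns (illustrative)
rf.ret <- c(0.004,  0.004, 0.004,  0.004)   # risk-free rate (illustrative)
down.market <- pmax(0, rf.ret - rm.ret)     # positive only in down markets
down.market
```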
# Henriksson-Merton's market timing model
fit.mktTiming <- fitTsfmMT(asset.names=asset.names,
mkt.name="SP500.TR", # specify which of the columns in data corresponds to the market returns using argument mkt.name.
rf.name="US.3m.TR",
data=managers)
t(fit.mktTiming$beta)
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
SP500.TR 0.4498075 0.1194041 0.56146555 0.9537383 0.35642547 0.2750302
down.market -0.1251174 0.4569814 -0.01914824 -0.5494516 -0.08451445 0.1060467
# Interpretation:
# Test market timing ability:
# a negative down.market coefficient implies the manager has little or no market-timing ability
# so manager 2 has the best market-timing ability, followed by manager 6 --> they are the only ones with positive down.market coefficients
fit.mktTiming$r2
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
0.43816793 0.19598508 0.43414205 0.33395246 0.08344676 0.26314396
# Interpretation:
# R^2 -> how good the data fits to the model.
# all values have increased slightly
fit.mktTiming$resid.sd
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
0.01934591 0.03298423 0.02748381 0.04383181 0.04442091 0.02074238
# Interpretation:
# volatility: how much the return jumps around relative to its relationship to an index (SP500)
# risk: the higher the worse
###fit methods
#ls = least squares
#dls = discounted least squares (weighted least squares with exponentially declining weights)
#robust = is good for data with outliers
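The DLS idea can be sketched as a weighted lm() with exponentially declining observation weights, so recent months count more. The 0.95 decay factor below is an assumption for illustration, not the fitTsfm.control default:

```r
# Discounted least squares sketch: exponential observation weights.
set.seed(9)
n <- 132
w <- 0.95^((n - 1):0)               # newest observation gets the largest weight
x <- rnorm(n)
y <- 0.5 * x + rnorm(n, sd = 0.1)
dls.fit <- lm(y ~ x, weights = w)   # weighted least squares
coef(dls.fit)["x"]                  # close to the true beta of 0.5
```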
#comparison
fit.mktTiming
Call:
fitTsfm(asset.names = asset.names, factor.names = factor.names,
rf.name = NULL, data = dat.xts, fit.method = fit.method,
variable.selection = "none", control = control)
Model dimensions:
Factors Assets Periods
2 6 132
Regression Alphas:
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
(Intercept) 0.007927 0.00102 0.006546 0.01348 0.003029 0.006344
Factor Betas:
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
SP500.TR 0.4498 0.1194 0.56147 0.9537 0.35643 0.275
down.market -0.1251 0.4570 -0.01915 -0.5495 -0.08451 0.106
R-squared values:
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
0.43817 0.19599 0.43414 0.33395 0.08345 0.26314
Residual Volatilities:
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
0.01935 0.03298 0.02748 0.04383 0.04442 0.02074
fit.singleIndex
Call:
fitTsfm(asset.names = asset.names, factor.names = "SP500.TR",
rf.name = "US.3m.TR", data = managers)
Model dimensions:
Factors Assets Periods
1 6 132
Regression Alphas:
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
(Intercept) 0.005775 0.009093 0.006216 0.00403 0.001733 0.007837
Factor Betas:
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
SP500.TR 0.3901 0.3384 0.5523 0.6914 0.3208 0.3235
R-squared values:
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
0.43387 0.16732 0.43409 0.31480 0.08286 0.26006
Residual Volatilities:
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
0.01934 0.03343 0.02738 0.04429 0.04414 0.02062
Fitting the model with more factors:
# The next example performs LS regression using all 3 available factors in the dataset.
fit.ols <- fitTsfm(asset.names=asset.names,
factor.names=factor.names, # all 3 available factors: the edhec, sp500 & US.10Y.TR/US Treasury
rf.name="US.3m.TR",
data=managers)
fit.ols$beta
# Interpretation:
# now we consider all factors (explanatory factors)
# Sensitivity:
# when the return of EDHEC rises 1% --> HAM2 rises 0.1547%
# when the SP500 return rises 1%, HAM2 decreases by 0.195%
# when the US.10Y.TR return rises 1%, HAM2 decreases by 0.0504%
fit.ols$r2
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
0.5005212 0.5144445 0.6569522 0.4125014 0.2322926 0.5644128
# Interpretation:
# how good does the data fit to the model
# Ham3 fits the best with 66%
fit.ols$resid.sd
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
0.01888065 0.02534926 0.02161320 0.04271503 0.04093181 0.01608030
# Interpretation:
# Volatility
# How much they jump around --> HAM4 the most, at 0.0427
fit.robust <- fitTsfm(asset.names=asset.names,
factor.names=factor.names,
rf.name="US.3m.TR",
data=managers,
fit.method="Robust") # Method "Robust"!
Loading required package: RobStatTM
Loading required package: fit.models
Attaching package: ‘fit.models’
The following object is masked from ‘package:PortfolioAnalytics’:
center
Attaching package: ‘RobStatTM’
The following object is masked from ‘package:robustbase’:
BYlogreg
The following object is masked from ‘package:datasets’:
stackloss
fit.robust$beta
# Interpretation:
fit.robust$r2
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
0.3786223 0.3143965 0.6166690 0.3978801 0.2879938 0.5598650
# Interpretation:
# R-squared is now lower for each
# maybe they all had outliers
fit.robust$resid.sd
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
0.01871677 0.02227978 0.01591560 0.03893553 0.02970911 0.01712927
# Interpretation:
par(mfrow=c(2,1))
plot(fit.ols, plot.single=TRUE, which=1, asset.name="HAM3")
mtext("LS", side=3)
Error in mtext("LS", side = 3) : plot.new has not been called yet
plot(fit.robust, plot.single=TRUE, which=1, asset.name="HAM3")
mtext("Robust", side=3)
Error in mtext("Robust", side = 3) : plot.new has not been called yet
par(mfrow=c(1,2))
plot(fit.ols, which=5, xlim=c(0,0.045), sub="LS")
plot(fit.robust, which=5, xlim=c(0,0.045), sub="Robust")
Though the R-squared values improved by adding more factors in fit.ols (compared to the single index model), one might prefer to employ variable selection methods such as “stepwise”, “subsets” or “lars” to avoid over-fitting. The method can be selected via the variable.selection argument. The default, “none”, uses all the factors and performs no variable selection:
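The stepwise idea can be sketched in base R with AIC-based step() on a toy three-factor regression where only one factor truly drives returns (all names below are made up; fitTsfm's selection methods work in a similar spirit):

```r
# Stepwise selection sketch: step() drops uninformative factors by AIC.
set.seed(13)
n <- 132
f1 <- rnorm(n); f2 <- rnorm(n); f3 <- rnorm(n)
y <- 0.8 * f1 + rnorm(n, sd = 0.1)             # only f1 truly matters
sel <- step(lm(y ~ f1 + f2 + f3), trace = 0)   # backward/forward AIC search
names(coef(sel))                               # f1 survives the selection
```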
fit.lars <- fitTsfm(asset.names=asset.names,
factor.names=factor.names,
data=managers,
rf.name="US.3m.TR",
variable.selection="lars")
fit.lars
Call:
fitTsfm(asset.names = asset.names, factor.names = factor.names,
rf.name = "US.3m.TR", data = managers, variable.selection = "lars")
Model dimensions:
Factors Assets Periods
3 6 132
Regression Alphas:
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
[1,] 0.005371 0.001508 -0.00124 -0.000983 -0.002843 0.004038
Factor Betas:
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
EDHEC.LS.EQ 0.2678 1.3404 1.2509 1.1290 1.176 1.2503
SP500.TR 0.2872 -0.1051 0.1312 0.2392 . -0.1752
US.10Y.TR -0.2302 . 0.1437 . 0.242 -0.1739
R-squared values:
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
0.5005 0.5064 0.6570 0.4054 0.2212 0.5644
Residual Volatilities:
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
0.01880 0.02534 0.02152 0.04261 0.04067 0.01595
# Interpretation:
# Subset --> the best performing subset within a range of subset sizes is chosen
fit.sub <- fitTsfm(asset.names=asset.names,
factor.names=factor.names,
data=managers,
rf.name="US.3m.TR",
variable.selection="subsets",
nvmin=2, nvmax=2)
fit.sub
Call:
fitTsfm(asset.names = asset.names, factor.names = factor.names,
rf.name = "US.3m.TR", data = managers, variable.selection = "subsets",
nvmin = 2, nvmax = 2)
Model dimensions:
Factors Assets Periods
3 6 132
Regression Alphas:
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
(Intercept) 0.006144 0.00063 -0.000903 -0.00183 -0.003459 0.00366
Factor Betas:
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
EDHEC.LS.EQ . 1.545 1.2447 1.2279 1.2944 1.2208
SP500.TR 0.3716 -0.199 0.1193 0.2847 . -0.1191
US.10Y.TR -0.2324 . . . 0.3379 .
R-squared values:
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
0.4669 0.5137 0.6508 0.4100 0.2241 0.5422
Residual Volatilities:
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
0.01884 0.02526 0.02171 0.04262 0.04087 0.01635
# Here, the best subset of size 2 for each asset is chosen by specifying nvmin = nvmax = 2. Note that when nvmin < nvmax, the best subset is chosen from a range of subset sizes [nvmin, nvmax]. Default is nvmin = 1.
# Interpretation:
# we see all together
# intercepts = alpha
# where we see the indices --> betas
plot(fit.sub, which=2, f.sub=1:3)
plot(fit.lars, which=2, f.sub=1:3)
Comparing the coefficients and R-squared values from the two models, we find that the method that uses more factors for an asset has a higher R-squared value, as expected. However, when both “lars” and “subsets” choose the same number of factors, the “lars” fits have slightly higher R-squared values.
Many useful generic accessor functions are available for “tsfm” fit objects:
Argument se.type, one of “Default”, “HC” or “HAC”, allows for heteroskedasticity and auto-correlation consistent estimates and standard errors whenever possible. A “summary.tsfm” object is returned which contains a list of summary objects returned by “lm”, “lm.Rob” or “lars” for each asset fit.
methods(class="tsfm")
[1] coef fitted fmCov fmEsDecomp fmSdDecomp fmVaRDecomp plot portEsDecomp
[9] portSdDecomp portVaRDecomp portVolDecomp predict print repRisk residuals riskDecomp
[17] summary
see '?methods' for accessing help and source code
All estimated coefficients from the LS fit using all 3 factors
coef(fit.ols)
NA
Compare returns data with fitted and residual values for HAM1 from fit.lars
HAM1.ts <- merge(fit.lars$data[,1],
fitted(fit.lars)[,1],
residuals(fit.lars)[,1])
colnames(HAM1.ts) <- c("HAM1.return","HAM1.fitted","HAM1.residual")
tail(HAM1.ts)
HAM1.return HAM1.fitted HAM1.residual
2006-07-31 -0.01863 0.001310394 -0.019940394
2006-08-31 0.01169 0.008784678 0.002905322
2006-09-30 0.00224 0.008701977 -0.006461977
2006-10-31 0.03889 0.017346504 0.021543496
2006-11-30 0.00740 0.011519868 -0.004119868
2006-12-31 0.00709 0.015633989 -0.008543989
# Interpretation:
# fitted --> the returns which can be explained through the model
# residual --> the returns which cannot be explained through the model
summary(fit.sub, se.type="HAC")
Call:
fitTsfm(asset.names = asset.names, factor.names = factor.names,
rf.name = "US.3m.TR", data = managers, variable.selection = "subsets",
nvmin = 2, nvmax = 2)
Factor Model Coefficients:
Asset1: HAM1
(HAC Standard Errors & T-stats)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.00614 0.00181 3.40 0.0009 ***
SP500.TR 0.37163 0.04930 7.54 7.4e-12 ***
US.10Y.TR -0.23242 0.07091 -3.28 0.0013 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
R-squared: 0.467, Residual Volatility: 0.0188
Asset2: HAM2
(HAC Standard Errors & T-stats)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.00063 0.00239 0.26 0.792
EDHEC.LS.EQ 1.54468 0.24583 6.28 5.9e-09 ***
SP500.TR -0.19897 0.10595 -1.88 0.063 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
R-squared: 0.514, Residual Volatility: 0.0253
Asset3: HAM3
(HAC Standard Errors & T-stats)
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.000903 0.001871 -0.48 0.63029
EDHEC.LS.EQ 1.244687 0.328379 3.79 0.00024 ***
SP500.TR 0.119303 0.124063 0.96 0.33822
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
R-squared: 0.651, Residual Volatility: 0.0217
Asset4: HAM4
(HAC Standard Errors & T-stats)
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.00183 0.00428 -0.43 0.66977
EDHEC.LS.EQ 1.22790 0.34069 3.60 0.00046 ***
SP500.TR 0.28467 0.12115 2.35 0.02046 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
R-squared: 0.41, Residual Volatility: 0.0426
Asset5: HAM5
(HAC Standard Errors & T-stats)
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.00346 0.00396 -0.87 0.38
EDHEC.LS.EQ 1.29442 0.28326 4.57 1.9e-05 ***
US.10Y.TR 0.33791 0.21106 1.60 0.11
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
R-squared: 0.224, Residual Volatility: 0.0409
Asset6: HAM6
(HAC Standard Errors & T-stats)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.00366 0.00286 1.28 0.20
EDHEC.LS.EQ 1.22084 0.21548 5.67 4.2e-07 ***
SP500.TR -0.11910 0.10905 -1.09 0.28
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
R-squared: 0.542, Residual Volatility: 0.0163
# the factor model covariance from a fitted factor model.
fmCov(fit.sub)
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
HAM1 0.000661 0.000257 0.000411 0.000527 0.000286 0.000231
HAM2 0.000257 0.001297 0.000710 0.000807 0.000631 0.000545
HAM3 0.000411 0.000710 0.001334 0.001024 0.000732 0.000601
HAM4 0.000527 0.000807 0.001024 0.003051 0.000855 0.000689
HAM5 0.000286 0.000631 0.000732 0.000855 0.002348 0.000529
HAM6 0.000231 0.000545 0.000601 0.000689 0.000529 0.000720
# factor model return correlation plot
plot(fit.sub, which=8)
# fmSdDecomp performs a decomposition for all assets in the given factor model fit object
decomp <- fmSdDecomp(fit.sub)
names(decomp)
[1] "Sd.fm" "mSd" "cSd" "pcSd"
# All Information together
decomp
$Sd.fm
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
0.0257 0.0360 0.0365 0.0552 0.0485 0.0268
$mSd
EDHEC.LS.EQ SP500.TR US.10Y.TR Residuals
HAM1 0.0101 0.0284 -0.005874 0.733
HAM2 0.0141 0.0178 -0.002272 0.701
HAM3 0.0162 0.0284 -0.002941 0.594
HAM4 0.0126 0.0242 -0.002367 0.772
HAM5 0.0106 0.0165 0.000953 0.843
HAM6 0.0159 0.0215 -0.002621 0.609
$cSd
EDHEC.LS.EQ SP500.TR US.10Y.TR Residuals
HAM1 0.0000 0.01054 0.001365 0.01381
HAM2 0.0218 -0.00354 0.000000 0.01772
HAM3 0.0202 0.00339 0.000000 0.01291
HAM4 0.0154 0.00689 0.000000 0.03289
HAM5 0.0137 0.00000 0.000322 0.03447
HAM6 0.0194 -0.00256 0.000000 0.00996
$pcSd
EDHEC.LS.EQ SP500.TR US.10Y.TR Residuals
HAM1 0.0 41.00 5.309 53.7
HAM2 60.6 -9.82 0.000 49.2
HAM3 55.4 9.29 0.000 35.3
HAM4 28.0 12.48 0.000 59.6
HAM5 28.2 0.00 0.664 71.1
HAM6 72.4 -9.54 0.000 37.1
# get:
decomp$Sd.fm
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
0.0257 0.0360 0.0365 0.0552 0.0485 0.0268
# the factor model standard deviation for all assets
decomp$cSd
EDHEC.LS.EQ SP500.TR US.10Y.TR Residuals
HAM1 0.0000 0.01054 0.001365 0.01381
HAM2 0.0218 -0.00354 0.000000 0.01772
HAM3 0.0202 0.00339 0.000000 0.01291
HAM4 0.0154 0.00689 0.000000 0.03289
HAM5 0.0137 0.00000 0.000322 0.03447
HAM6 0.0194 -0.00256 0.000000 0.00996
# the component contributions to Sd
decomp$mSd
EDHEC.LS.EQ SP500.TR US.10Y.TR Residuals
HAM1 0.0101 0.0284 -0.005874 0.733
HAM2 0.0141 0.0178 -0.002272 0.701
HAM3 0.0162 0.0284 -0.002941 0.594
HAM4 0.0126 0.0242 -0.002367 0.772
HAM5 0.0106 0.0165 0.000953 0.843
HAM6 0.0159 0.0215 -0.002621 0.609
# the marginal factor contributions to Sd
decomp$pcSd
EDHEC.LS.EQ SP500.TR US.10Y.TR Residuals
HAM1 0.0 41.00 5.309 53.7
HAM2 60.6 -9.82 0.000 49.2
HAM3 55.4 9.29 0.000 35.3
HAM4 28.0 12.48 0.000 59.6
HAM5 28.2 0.00 0.664 71.1
HAM6 72.4 -9.54 0.000 37.1
# the percentage component contributions to Sd
# plot the percentage component contributions to Sd
plot(fit.sub, which=9, f.sub=1:3)
VaR: the value at risk for a given probability level indicates the amount of loss that will not be exceeded within a given period of time with that probability.
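Historically estimated VaR is just an empirical return quantile. A base-R sketch of the definition on simulated returns (fmVaRDecomp's default estimation may differ):

```r
# Historical 95% VaR as the 5% empirical quantile of returns.
set.seed(11)
r <- rnorm(1000, mean = 0.005, sd = 0.05)   # simulated monthly returns
VaR.95 <- quantile(r, probs = 0.05)         # loss not exceeded with 95% probability
mean(r < VaR.95)                            # roughly 5% of months exceed it
```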
decomp1 <- fmVaRDecomp(fit.sub)
names(decomp1)
[1] "VaR.fm" "n.exceed" "idx.exceed" "mVaR" "cVaR" "pcVaR"
# All Information together
decomp1
$VaR.fm
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
-0.0290 -0.0335 -0.0440 -0.0833 -0.0752 -0.0357
$n.exceed
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
7 7 7 7 4 4
$idx.exceed
$idx.exceed$HAM1
[1] 32 69 79 81 84 85 125
$idx.exceed$HAM2
[1] 51 57 58 61 74 78 122
$idx.exceed$HAM3
[1] 32 38 55 57 78 81 84
$idx.exceed$HAM4
[1] 32 58 68 69 79 81 100
$idx.exceed$HAM5
[1] 58 61 68 70
$idx.exceed$HAM6
[1] 79 103 112 118
$mVaR
EDHEC.LS.EQ SP500.TR US.10Y.TR Residuals
HAM1 -0.01169 -0.0289 0.005761 -0.899
HAM2 -0.00997 -0.0206 0.005903 -0.880
HAM3 -0.01871 -0.0540 0.008118 -0.656
HAM4 -0.01368 -0.0353 -0.000279 -1.326
HAM5 -0.01020 -0.0291 -0.000389 -1.513
HAM6 -0.01705 -0.0273 0.009859 -1.108
$cVaR
EDHEC.LS.EQ SP500.TR US.10Y.TR Residuals
HAM1 0.0000 -0.01075 -0.001339 -0.0169
HAM2 -0.0154 0.00410 0.000000 -0.0222
HAM3 -0.0233 -0.00645 0.000000 -0.0143
HAM4 -0.0168 -0.01005 0.000000 -0.0565
HAM5 -0.0132 0.00000 -0.000131 -0.0618
HAM6 -0.0208 0.00326 0.000000 -0.0181
$pcVaR
EDHEC.LS.EQ SP500.TR US.10Y.TR Residuals
HAM1 0.0 37.01 4.611 58.4
HAM2 45.9 -12.22 0.000 66.3
HAM3 52.9 14.66 0.000 32.4
HAM4 20.1 12.06 0.000 67.8
HAM5 17.6 0.00 0.175 82.3
HAM6 58.3 -9.13 0.000 50.8
# get the factor model value-at-risk for all assets
decomp1$VaR.fm
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
-0.0290 -0.0335 -0.0440 -0.0833 -0.0752 -0.0357
# get the percentage component contributions to VaR
decomp1$pcVaR
EDHEC.LS.EQ SP500.TR US.10Y.TR Residuals
HAM1 0.0 37.01 4.611 58.4
HAM2 45.9 -12.22 0.000 66.3
HAM3 52.9 14.66 0.000 32.4
HAM4 20.1 12.06 0.000 67.8
HAM5 17.6 0.00 0.175 82.3
HAM6 58.3 -9.13 0.000 50.8
# plot the percentage component contributions to VaR
plot(fit.sub, which=11, f.sub=1:3)
The term risk measure is a collective term for statistical measures that can be used to quantitatively describe the uncertainty of an event. VaR is defined as the amount of loss that will not be exceeded in a given period of time with a specified probability p (“confidence level” α = 1 - p).
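Expected shortfall extends this: it is the average of the returns beyond the VaR threshold. A base-R sketch of what fmEsDecomp(method = "historical") estimates, on simulated returns (not the managers data):

```r
# Historical expected shortfall: mean return in the tail beyond VaR.
set.seed(11)
r <- rnorm(1000, mean = 0.005, sd = 0.05)
VaR.95 <- quantile(r, probs = 0.05)
ES.95 <- mean(r[r <= VaR.95])   # average loss in the worst ~5% of months
```

ES is always at least as extreme as VaR, since it averages only the returns beyond the VaR cutoff.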
decomp2 <- fmEsDecomp(fit.sub, method="historical")
names(decomp2)
[1] "ES.fm" "mES" "cES" "pcES"
# get the factor model expected shortfall for all assets
decomp2$ES.fm
HAM1 HAM2 HAM3 HAM4 HAM5 HAM6
-0.0538 -0.0370 -0.0586 -0.1153 -0.1067 -0.0411
# get the component contributions to ES
decomp2$cES
EDHEC.LS.EQ SP500.TR US.10Y.TR Residuals
HAM1 0.00000 -0.02592 -0.00441 -0.0235
HAM2 -0.01232 0.00111 0.00000 -0.0258
HAM3 -0.02715 -0.00873 0.00000 -0.0227
HAM4 -0.03576 -0.02132 0.00000 -0.0582
HAM5 -0.00309 0.00000 0.00302 -0.1067
HAM6 -0.02737 0.00422 0.00000 -0.0180
# get the marginal factor contributions to ES
decomp2$mES
EDHEC.LS.EQ SP500.TR US.10Y.TR Residuals
HAM1 -0.02593 -0.06975 0.01898 -1.25
HAM2 -0.00798 -0.00559 0.00658 -1.02
HAM3 -0.02181 -0.07315 0.01157 -1.05
HAM4 -0.02912 -0.07488 0.01364 -1.37
HAM5 -0.00239 -0.00771 0.00894 -2.61
HAM6 -0.02242 -0.03544 0.01181 -1.10
# get the percentage component contributions to ES
decomp2$pcES
EDHEC.LS.EQ SP500.TR US.10Y.TR Residuals
HAM1 0.0 48.1 8.19 43.7
HAM2 33.3 -3.0 0.00 69.7
HAM3 46.3 14.9 0.00 38.8
HAM4 31.0 18.5 0.00 50.5
HAM5 2.9 0.0 -2.83 99.9
HAM6 66.6 -10.3 0.00 43.7
# plot the percentage component contributions to ES
plot(fit.sub, which=10, f.sub=1:3)
plot(fit.sub, which=6)
# Make a plot selection (1-12 or 0 to exit)
plot(fit.sub, plot.single=TRUE, asset.name="HAM1", which=10)
plot(fit.sub, plot.single=TRUE, asset.name="HAM1", which=14)
grid()
plot(fit.sub, plot.single=TRUE, asset.name="HAM1", which=11)
plot(fit.sub, plot.single=TRUE, asset.name="HAM1", which=12)
Follow the file ffmVignette.pdf and interpret your results.
The general mathematical form of the equity fundamental factor model (FFM) implemented in factorAnalytics is

rt = αt · 1 + Bt−1 ft + εt,  t = 1, 2, · · · , T  (1)

where:
- the equity returns vector rt, the vector of ones 1, and the residuals εt are N × 1 vectors,
- Bt−1 is an N × K exposures (risk factors) matrix, and
- ft is a K × 1 vector of random factor returns.
It is assumed that the εt have zero mean with diagonal covariance matrix Dt = diag(σ²t,1, σ²t,2, · · · , σ²t,N), and are uncorrelated with the ft.
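For a single period t, equation (1) is just a cross-section regression of the N asset returns on the K exposure columns. A small sketch with hypothetical data (the exposure names are illustrative):

```r
# One period of the FFM: r_t = alpha*1 + B_{t-1} f_t + eps_t, fit by OLS
set.seed(1)
N <- 30
B <- matrix(rnorm(N * 3), N, 3,
            dimnames = list(NULL, c("SIZE", "P2B", "EV2S")))  # exposures B_{t-1}
f <- c(0.01, -0.02, 0.005)                                    # true factor returns f_t
r <- 0.002 + B %*% f + rnorm(N, sd = 0.001)                   # asset returns r_t
fit <- lm(r ~ B)   # intercept estimates alpha, slopes estimate f_t
round(coef(fit), 3)
```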
# factorAnalytics
# The following U.S. stock returns and data sets are included in factorAnalytics for now:
#
# 1) factorDataSetDjia:
# - Monthly returns of 30 stocks in the DJIA from January 2000 to March 2013 = Dow Jones Industrial Average
# - with 5 corresponding style factors:
# - MKTCAP (Market Capitalization)
# - ENTVAL (Enterprise Value)
# - P2B (Price-to-Book ratio)
# - EV2S (Enterprise Value-to-Sales ratio)
# - SIZE (Size)
# - and a sector factor with 9 sectors:
# - ENERGY (Energy),
# - COSTAP (Consumer Staples),
# - INDUST (Industrials),
# - MATRLS (Materials),
# - FINS (Financials),
# - INFOTK (Information Technology),
# - HEALTH (Healthcare)
# - CODISC (Consumer Discretionary),
# - TELCOM (Telecommunications)
#
#
# 2) factorDataSetDjia5Yrs:
# - A five-year segment of factorDataSetDjia from January 2008 to December 2012.
#
#
# 3) wtsDjiaGmvLo:
# - Weight vector of dimension 30 containing the weights of a long-only
# global minimum variance (GMV) portfolio for the 30 stocks in the factorDataSetDjia5Yrs data set.
# - The weight vector was obtained using PortfolioAnalytics with the usual sample covariance matrix
# based on the 5 years of returns in factorDataSetDjia5Yrs.
data("factorDataSetDjia5Yrs")
dataDjia5Yr = factorDataSetDjia5Yrs
head(dataDjia5Yr,5)
An FFM is generally fit by a two-step method: 1) least squares or robust regression in the first step, and 2) weighted least squares or weighted robust regression in the second step, where the weights are computed from the results of the first step.
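A sketch of the two-step idea with hypothetical data (not the package's exact implementation): step 1 runs an OLS cross-section regression per period, each asset's residual variance is estimated from its time series of first-step residuals, and step 2 refits by WLS with weights 1/σ²ᵢ.

```r
# Two-step LS/WLS fit for a panel of returns (hypothetical data)
set.seed(2)
nT <- 60; N <- 20
B <- matrix(rnorm(N * 2), N, 2)                # fixed exposures, for simplicity
sig <- runif(N, 0.01, 0.05)                    # true per-asset residual sd's
f <- matrix(rnorm(nT * 2, sd = 0.02), nT, 2)   # factor returns over time
R <- f %*% t(B) + sapply(sig, function(s) rnorm(nT, sd = s))  # nT x N returns
# step 1: OLS cross-section regression per period; collect N x nT residuals
res1 <- sapply(1:nT, function(t) resid(lm(R[t, ] ~ B)))
w <- 1 / apply(res1, 1, var)                   # weights 1/sigma_i^2
# step 2: WLS cross-section regression, e.g. for the last period
fit2 <- lm(R[nT, ] ~ B, weights = w)
round(coef(fit2), 3)
```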
# The Fundamental-Factor-Model-Function: fitFfm
fitDjia5Yr = fitFfm(dataDjia5Yr, addIntercept = T,
asset.var = "TICKER", # asset.var = name of variable for asset names
ret.var = "RETURN", # ret.var = name of variable for asset returns
date.var = "DATE", # date.var = name of variable containing the dates
exposure.vars = c("SECTOR","SIZE", "P2B", "EV2S"), # exposure.vars = variables containing the fundamental factor exposures
z.score = "crossSection") # z.score has to be one of "none", "crossSection", or "timeSeries"
names(fitDjia5Yr)
[1] "factor.fit" "beta" "factor.returns" "residuals" "r2" "factor.cov"
[7] "g.cov" "resid.cov" "return.cov" "restriction.mat" "resid.var" "call"
[13] "time.periods" "data" "date.var" "ret.var" "asset.var" "exposure.vars"
[19] "weight.var" "fit.method" "asset.names" "factor.names" "activeWeights" "activeReturns"
[25] "IR"
One of the most basic model fit statistics is the R-squared, and you can assess the goodness of your FFM fits by using the function fmRsq to plot the time series of R-squared values for each of the 60 fits over the five-year window. This function computes and plots the time series of ordinary R-squared values by default, but it can also do so for the adjusted R-squared, or both.
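The adjusted R-squared penalizes R-squared for the number of regressors K given the N assets in each cross-section. A sketch of the standard formula (the N and K used here are illustrative, not the fitted model's):

```r
# Adjusted R-squared: 1 - (1 - R^2) * (N - 1) / (N - K - 1)
adj_r2 <- function(r2, N, K) 1 - (1 - r2) * (N - 1) / (N - K - 1)
round(adj_r2(0.78, N = 30, K = 12), 2)  # an R^2 of 0.78 drops to about 0.62
```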
# Fit R-Squared Values
fmRsq(fitDjia5Yr,
rsq = T, # logical; if TRUE, Factor Model R-squared values are computed for the portfolio. Default is TRUE.
rsqAdj = T, # logical; if TRUE, Adjusted R-squared values are computed for the portfolio. Default is FALSE.
plt.type = 2, # 1 indicates barplot, 2 indicates time series xy plot. Default is 2.
isPrint = F,
lwd = 0.7,
stripText.cex = 0.8,
axis.cex = 0.8)
Mean R-Squared Mean Adj R-Squared
0.78 0.54
# Interpretation:
# 1) Upper plot: mean R-squared across periods = 0.78 (output of the chunk above).
#    -> Problem: R-squared increases every time an independent variable is added,
#       so the more independent variables, the higher the R-squared.
# 2) Mean adjusted R-squared = 0.54 (output of the chunk above).
#    -> Once we account for this problem and use the adjusted measure, the value
#       drops from 78% to 54%, and some individual values are even negative.
When your model includes continuous style factor variables you can also compute and display the time series of variance inflation factors (VIFs), as in the call to vif below. These can help you determine whether or not there are any regression collinearity problems.
# the time series of mean variance inflation factors (VIF’s)
vif(fitDjia5Yr,
isPlot = T,
isPrint = F,
lwd = 0.7,
stripText.cex = 0.8,
axis.cex = 0.8)
$Mean.VIF
SIZE P2B EV2S
1.14 1.09 1.15
# Multicollinearity is a problem in regression analysis that occurs when two or more explanatory variables are very strongly correlated with one another. First, as multicollinearity increases, the procedure for estimating the regression coefficients becomes unstable and statements about the estimated coefficients become increasingly imprecise. Second, the model interpretation is no longer unambiguous. The classic symptom of strong multicollinearity is a high coefficient of determination (R-squared) combined with low t-values for the individual regression parameters.
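A sketch of what a VIF measures, with hypothetical data: for exposure j, VIFⱼ = 1/(1 − Rⱼ²), where Rⱼ² comes from regressing exposure j on the other exposures. Values near 1 indicate little collinearity.

```r
# Manual VIF computation for one variable (hypothetical exposures)
set.seed(3)
x1 <- rnorm(100)
x2 <- 0.5 * x1 + rnorm(100)  # mildly correlated with x1
x3 <- rnorm(100)
r2 <- summary(lm(x1 ~ x2 + x3))$r.squared  # R^2 of x1 on the others
vif_x1 <- 1 / (1 - r2)
round(vif_x1, 2)
```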
Plots of the time series of t-statistics for the factors in a fitted FFM, with horizontal dashed lines provided to judge whether or not a factor is significant at the 5% level, may be obtained with the function fmTstats.
# The time series of t-statistics for the factors (5% significant level)
fmTstats(fitDjia5Yr,
whichPlot = "tStats",
color = "blue",
lwd = 0.7,
layout = c(3, 4),
stripText.cex = 0.8,
axis.cex = 0.8)
# Interpretation:
# fmTstats can also compute the number of significant t-statistics:
# the number of positive,
# negative and
# total significant t-statistics
fmTstats(fitDjia5Yr,
whichPlot = "significantTstatsV",
color = "blue",
stripText.cex = 0.8,
axis.cex = 0.8,
layout = c(3, 4))
# Interpretation: 3 style factors + market + 8 sectors
# The red lines show the critical values of the t-statistic; everything larger
# (in absolute value) than the critical value is significant.
# A consequence of collinearity -> not many significant parameters.
# See below: a relatively low number of significant parameters (except for the market).
The factorAnalytics package contains the following risk and performance reporting functions: 1) repExposures 2) repReturn 3) repRisk
We illustrate the use of each of these in turn using “fitDjia5Yr” and the corresponding global minimum variance long-only portfolio weights object “wtsDjiaGmvLo”
# load the long-only global minimum variance portfolio weights vector
data(wtsDjiaGmvLo)
# and give it the shorter name wtsDjia
wtsDjia = wtsDjiaGmvLo
The portfolio exposure to a given risk factor is the inner (dot) product of the portfolio weight vector with the column of the exposures matrix Bt−1 corresponding to that factor. The style factor exposures vary over time, but the sector factors are fixed and each sector is represented by a column of zeros and ones. Thus the portfolio exposure to a style factor varies over time and hence has a distribution with a mean and volatility. The portfolio exposure to a sector factor, on the other hand, has a fixed value depending on the portfolio weights and the number of firms in that sector.
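A sketch of the exposure calculation with hypothetical weights and exposures: the portfolio exposure to factor j is w'B[, j], and for a sector dummy column this is just the sum of the weights in that sector.

```r
# Portfolio exposures as dot products w'B (hypothetical 5-asset example)
set.seed(4)
N <- 5
w <- rep(1 / N, N)                      # equal weights
B <- cbind(SIZE = rnorm(N),             # a style exposure (varies over time)
           SEC1 = c(1, 1, 0, 0, 0),     # sector dummies (fixed zeros and ones)
           SEC2 = c(0, 0, 1, 1, 1))
expo <- drop(t(w) %*% B)                # portfolio exposures to all factors
expo
```

Note that `expo["SEC1"]` equals `sum(w[1:2])`, the total weight in the first sector.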
# compute volatilities of the style factors and sector factors
repExposures(fitDjia5Yr,
wtsDjia,
isPlot = F,
digits = 1,
stripText.cex = 0.8,
axis.cex = 0.8)
$Style.Exposures
Mean Volatility
SIZE 89.6 4.6
P2B 25.2 20.0
EV2S -31.5 6.8
$Sec.Exposures
# Interpretation:
# Exposure generally refers to the fact of being exposed to a risk.
# Only the three style factor exposures vary over time and hence have a volatility:
# 1) Style factor SIZE: mean 89.6, volatility 4.6
# 2) Style factor P2B: mean 25.2, volatility 20.0
# 3) Style factor EV2S: mean -31.5, volatility 6.8
# plot the volatilities of the style factors and sector factors
repExposures(fitDjia5Yr,
wtsDjia,
isPrint = F,
isPlot = T,
which = 3,
add.grid = F,
zeroLine = T,
color = "Blue")
# plot the time series of the style factor exposures
repExposures(fitDjia5Yr,
wtsDjia,
isPrint = F,
isPlot = T,
which = 1,
add.grid = F,
zeroLine = T,
color = "Blue",
stripText.cex = 0.8,
axis.cex = 0.8)
#display boxplots of those exposures
repExposures(fitDjia5Yr,
wtsDjia,
isPrint = FALSE,
isPlot = TRUE,
which = 2,
notch = F,
layout = c(3,3))
The function repReturn provides you with the following choices of graphical reports, the results of which will also be printed because of the default printing option isPrint = T: 1. Time series plot of the portfolio returns decomposition 2. Time series plot of the portfolio style factor returns 3. Time series plot of the portfolio sector returns 4. Boxplot of the portfolio return components.
# calculate the mean and volatility of the various portfolio return components with the help of repReturn
repReturn(fitDjia5Yr,
wtsDjia,
isPlot = FALSE,
digits = 2)
Mean Volatility
PortRet 0.85 3.78
ResidRet 0.10 1.46
FacRet 0.76 3.30
Market 0.48 5.78
SIZE 0.16 2.49
P2B 0.14 0.51
EV2S 0.06 0.60
COSTAP 0.01 1.66
ENERGY -0.03 0.97
FINS 0.00 0.00
HEALTH 0.02 0.46
INDUST 0.00 0.00
INFOTK -0.10 0.73
MATRLS 0.00 0.00
TELCOM 0.01 0.37
# Interpretation:
# Plot portfolio returns decomposition
repReturn(fitDjia5Yr,
wtsDjia,
isPrint = FALSE,
isPlot = TRUE,
which = 1,
add.grid = TRUE,
scaleType = "same",
color = "Blue",
stripText.cex = 0.8,
axis.cex = 0.8)
# Interpretation:
# Plot portfolio styles factor return
repReturn(fitDjia5Yr,
wtsDjia,
isPrint = FALSE,
isPlot = TRUE,
which = 2,
add.grid = TRUE,
zeroLine = T,
color = "Blue",
scaleType = "same",
stripText.cex = 0.8,
axis.cex = 0.8)
# Interpretation:
# Plot portfolio sectors return
repReturn(fitDjia5Yr,
wtsDjia,
isPrint = FALSE,
isPlot = TRUE,
which = 3,
add.grid = TRUE,
zeroLine = T,
color = "Blue",
scaleType = "same",
stripText.cex = 0.8,
axis.cex = 0.8)
# Interpretation:
# Boxplot of Portfolio Returns Components
repReturn(fitDjia5Yr,
wtsDjia,
isPrint = FALSE,
isPlot = TRUE,
which = 4)
# Interpretation:
The function repRisk allows one to compute and display (graphically and in tabular form) factor decompositions of risk for a portfolio and for each of the assets used to fit a fundamental factor model with fitFfm. The risk measure can be chosen as: - standard deviation/volatility (SD) - expected shortfall (ES) - or value-at-risk (VaR)
and the factor risk decomposition can be chosen as: - factor percent contribution to risk (FPCR) - factor contribution to risk (FCR) - or factor marginal contribution to risk (FMCR).
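The three decomposition choices are related by an Euler decomposition of the risk measure. A sketch for Sd, shown here at the asset level with a hypothetical 2-asset covariance matrix: FMCRᵢ = (Σw)ᵢ / sd(p), FCRᵢ = wᵢ · FMCRᵢ, FPCRᵢ = 100 · FCRᵢ / sd(p); the FCRᵢ sum to the portfolio Sd and the FPCRᵢ sum to 100.

```r
# Euler decomposition of portfolio volatility (hypothetical covariance)
Sigma <- matrix(c(0.04, 0.01, 0.01, 0.09), 2, 2)  # 2-asset covariance matrix
w <- c(0.6, 0.4)                                  # portfolio weights
sdP <- sqrt(drop(t(w) %*% Sigma %*% w))           # portfolio volatility
FMCR <- drop(Sigma %*% w) / sdP                   # marginal contributions to risk
FCR  <- w * FMCR                                  # component contributions to risk
FPCR <- 100 * FCR / sdP                           # percent contributions to risk
c(sdP = sdP, sumFCR = sum(FCR), sumFPCR = sum(FPCR))
```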
# standard deviation/volatility (SD)
# factor percent contribution to risk - (FPCR)
# First we compute an FPCR decomposition of the portfolio and individual assets using Sd as the risk measure, and provide both Lattice visualization and tabular displays. For the Lattice display the argument sliceby = “factor” specifies that the panel conditioning is by risk factor, and the choice layout = c(5,1) results in a single row with five panels. We used nrowPrint = 10 to shorten the printed output from one row for the portfolio factor risk decomposition plus 22 rows for the stock factor risk decompositions (we are only using 22 of the DJIA stocks at the moment) to one row for the portfolio and 9 rows for the assets.
fitDjia5YrIntStyle = fitFfm(data = dataDjia5Yr,
exposure.vars = c("SIZE", "P2B", "EV2S"),
date.var = "DATE",
ret.var = "RETURN",
asset.var = "TICKER",
fit.method = "WLS",
z.score = "crossSection",
addIntercept = T)
repRisk(fitDjia5YrIntStyle,
wtsDjia,
risk = "Sd",
decomp = "FPCR",
nrowPrint = 10,
sliceby = "factor",
isPrint = T,
isPlot = T,
layout = c(5, 1),
stripText.cex = 0.8,
axis.cex = 0.8)
$SdFPCR
Alpha SIZE P2B EV2S Resid
Portfolio 102.0 -18.3 -0.4 -1.3 18.0
AA 28.9 45.2 4.0 -2.2 24.0
BA 54.3 16.7 8.7 4.3 16.0
BAC 15.1 -0.8 2.1 11.2 72.3
CAT 37.0 12.3 -0.2 -1.4 52.4
CVX 64.1 -12.8 4.4 4.9 39.4
DD 53.1 28.0 -1.8 -1.6 22.3
GE 34.6 -8.7 2.0 17.4 54.6
HPQ 37.1 26.4 3.4 -0.2 33.3
IBM 22.3 -0.6 26.0 0.1 52.3
# Interpretation:
# expected shortfall (ES)
# factor percent contribution to risk - (FPCR)
# Now we use expected shortfall (ES) to do an FPCR decomposition and provide only the Lattice
repRisk(fitDjia5YrIntStyle,
wtsDjia,
risk = "ES",
decomp = "FPCR",
nrowPrint = 10,
sliceby = "factor",
isPrint = F,
isPlot = T,
layout = c(5, 1),
stripText.cex = 0.8,
axis.cex = 0.8)
NULL
# Interpretation:
# What percentage of the standard deviation is determined by systematic risk (SIZE, P2B, EV2S) and by idiosyncratic risk (Resid)
# value-at-risk (VaR)
# factor percent contribution to risk - (FPCR)
# Now we use value-at-risk (VaR) to do the FPCR decomposition
repRisk(fitDjia5YrIntStyle,
wtsDjia,
risk = "VaR",
decomp = "FPCR",
nrowPrint = 10,
sliceby = "factor",
isPrint = F,
isPlot = T,
layout = c(5, 1),
stripText.cex = 0.8,
axis.cex = 0.8)
NULL
# Interpretation:
# compare the factor risk decompositions of a portfolio using all three risk measures, skipping the risk decomposition of the assets
repRisk(fitDjia5YrIntStyle,
wtsDjia,
risk = c("Sd", "ES", "VaR"),
decomp = "FPCR",
sliceby = "factor",
isPrint = T,
isPlot = TRUE,
layout = c(5, 1),
portfolio.only = T,
stripText.cex = 0.8,
axis.cex = 0.8)
the condition has length > 1 and only the first element will be used
$`Portfolio FPCR Non-Parametric`
Total Alpha SIZE P2B EV2S Resid
Sd 100 102 -18.3 -0.4 -1.3 18.0
ES 100 180 -64.2 -10.3 -6.7 0.9
VaR 100 189 -76.6 -7.8 -3.0 -2.1
# Interpretation:
The use of factor model Monte Carlo (FMMC) for a fundamental factor model results in a simulated set of asset returns based on a resampling of factor returns and a resampling or simulation of residual returns of the fitted model, using the exposures matrix from the last time period used in fitting the model.
The factorAnalytics function fmmcSemiParam implements the above FMMC method based on function arguments that are the result of first fitting a fundamental factor model to the data with fitFfm, combined with function arguments based on user options concerning the type of FMMC. We will illustrate the use of fmmcSemiParam on the DJIA five-year monthly data set factorDataSetDjia5Yrs.
# But first we take a look at the arguments of the function
args(fmmcSemiParam)
function (B = 1000, factor.ret, beta, alpha, resid.par, resid.dist = c("normal",
"Cornish-Fisher", "skew-t", "empirical"), boot.method = c("random",
"block"), seed = 123)
NULL
Arguments: - B is the number of bootstrap samples. - factor.ret is the set of factor return estimates returned by fitFfm. - beta is the exposures matrix for the final period returned by fitFfm. - alpha is a fixed vector of intercept values that, if omitted, are assumed to be zero.
Our example below uses the default values B = 1000, boot.method = “random” (a simple bootstrap) and seed = 123 (for reproducibility of the example).
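A sketch of what a simple bootstrap does here, with a hypothetical 3-factor series: draw B time indices with replacement and use them to resample rows of the factor-return matrix.

```r
# Simple (row-resampling) bootstrap of a factor-return matrix (hypothetical data)
set.seed(123)
nT <- 60
fac <- matrix(rnorm(nT * 3), nT, 3)               # T x K factor returns
idx <- sample(1:nT, size = 1000, replace = TRUE)  # B = 1000 resampled time indices
boot.fac <- fac[idx, ]                            # B x K bootstrapped factor returns
dim(boot.fac)
```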
We use two choices of resid.dist: first resid.dist = “empirical”, which corresponds to 2-(a) above, and then resid.dist = “normal”. The user must provide appropriate values for resid.par that depend on the choice of resid.dist. For resid.dist = “empirical”, resid.par must be the N × T dimensional xts time series in the $residuals component of the model fit; for resid.dist = “normal”, resid.par must be an N × 2 matrix whose first column contains estimates of the means of the residuals for each of the N assets and whose second column contains estimates of their standard deviations.
# In order to use fmmcSemiParam for the DJIA data we first fit a fundamental factor model (without alpha or market term) to the factorDataSetDjia5Yrs data
data("factorDataSetDjia5Yrs")
N = 30
exposure.vars <- c("P2B", "MKTCAP", "SECTOR")
fit.ffm = fitFfm(data = factorDataSetDjia5Yrs,
asset.var = "TICKER",
ret.var = "RETURN",
date.var = "DATE",
exposure.vars = exposure.vars)
# Next we use fmmcSemiParam to create simulated values of the asset returns based on the use of bootstrapped factor returns and bootstrapped (empirical) residuals:
resid.par = fit.ffm$residuals
fmmcDat = fmmcSemiParam(B = 1000,
factor.ret = fit.ffm$factor.returns,
beta = fit.ffm$beta,
resid.par = resid.par,
boot.method = "random",
resid.dist = "empirical")
names(fmmcDat)
[1] "sim.fund.ret" "boot.factor.ret" "sim.resid"
“sim.fund.ret” = a B × N matrix of simulated asset returns, “boot.factor.ret” = a B × K matrix of simulated factor returns, “sim.resid” = a B × N matrix of simulated residuals
# Now let’s verify that the returns of the 30 DJIA stocks over the five-year period are well represented by the set of 1000 simulated sets of 30 returns in fmmcDat$sim.fund.ret with respect to return means and standard deviations. To do this we first extract the multivariate time series of returns of those stocks from the data frame factorDataSetDjia5Yrs
data = factorDataSetDjia5Yrs
djiaDat = tapply(data$RETURN,
list(data$DATE, data$TICKER),
I)
djiaRet = xts(djiaDat,
as.yearmon(rownames(djiaDat)))
# Now we compare the simulated returns means with the observed returns means for the first 10 DJIA stocks
round(apply(djiaRet, 2, mean)[1:10], 3)
AA BA BAC CAT CVX DD GE HPQ IBM INTC
-0.012 0.004 0.000 0.014 0.007 0.008 0.000 -0.016 0.013 0.002
round(apply(fmmcDat$sim.fund.ret, 2, mean)[1:10], 3)
AA BA BAC CAT CVX DD GE HPQ IBM INTC
-0.010 0.011 -0.004 0.012 0.006 0.004 -0.003 -0.011 0.012 0.003
#same thing for returns standard deviations
round(apply(djiaRet, 2, sd)[1:10], 3)
AA BA BAC CAT CVX DD GE HPQ IBM INTC
0.137 0.089 0.197 0.126 0.064 0.093 0.109 0.090 0.056 0.081
round(apply(fmmcDat$sim.fund.ret, 2, sd)[1:10], 3)
AA BA BAC CAT CVX DD GE HPQ IBM INTC
0.141 0.080 0.187 0.125 0.057 0.093 0.105 0.094 0.056 0.079
The use of fmmcSemiParam with bootstrapped residuals as well as bootstrapped factor returns is attractive because it is simple and because in addition to making no distributional assumptions about the factor returns it makes no assumptions about the distributions of the residuals.
By way of contrast let’s see what happens if we assume the residuals associated with the 30 DJIA fitted fundamental factor model returns have normal distributions and fit them using the sample means and standard deviations of the residuals.
# First we use fmmcSemiParam with the default choice resid.dist = “normal” and with resid.par a matrix whose first column is the sample means of the residuals and whose second column is the standard deviations of the residuals
resid.mean = apply(coredata(fit.ffm$residuals),
                   2,
                   mean,
                   na.rm = TRUE)
resid.sd = matrix(sqrt(fit.ffm$resid.var))
resid.par = cbind(resid.mean, resid.sd)
fmmcDatNormal = fmmcSemiParam(factor.ret = fit.ffm$factor.returns,
beta = fit.ffm$beta,
resid.par = resid.par,
boot.method = "random")
#Then we compare the means of the simulated asset returns with those of the actual returns
round(apply(djiaRet, 2, mean)[1:10], 3)
AA BA BAC CAT CVX DD GE HPQ IBM INTC
-0.012 0.004 0.000 0.014 0.007 0.008 0.000 -0.016 0.013 0.002
round(apply(fmmcDatNormal$sim.fund.ret, 2, mean)[1:10], 3)
AA BA BAC CAT CVX DD GE HPQ IBM INTC
-0.013 0.010 -0.001 0.010 0.006 0.006 -0.004 -0.014 0.013 0.001
#Same with the standard deviation
round(apply(djiaRet, 2, sd)[1:10], 3)
AA BA BAC CAT CVX DD GE HPQ IBM INTC
0.137 0.089 0.197 0.126 0.064 0.093 0.109 0.090 0.056 0.081
round(apply(fmmcDatNormal$sim.fund.ret, 2, sd)[1:10], 3)
AA BA BAC CAT CVX DD GE HPQ IBM INTC
0.121 0.081 0.155 0.099 0.060 0.114 0.089 0.089 0.061 0.082
Once again the mean values agree quite well, but we see that the simulated returns based on the assumption of normally distributed residuals have volatilities that under-estimate the actual return volatilities for eight of the first 10 stocks. However, comparison of the volatilities for all 30 stocks reveals that there are only 13 stocks for which the volatilities of the simulated returns are smaller than those of the actual returns, and that when the volatilities of the simulated returns are larger than those of the actual returns they are much larger, for example .093 versus .065 for CAT and .132 versus .068 for HD.
Main message: it is not safe to use normal distributions when modeling stock returns with a fundamental factor model (or otherwise). It is for this reason that fmmcSemiParam allows you to use skewed t-distributions and Cornish-Fisher quantile representations of non-normal distributions for the residuals.
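A sketch of the standard Cornish-Fisher quantile correction that this kind of option is based on: the normal quantile z is adjusted using the skewness S and excess kurtosis K of the residuals, so that heavy tails and asymmetry are reflected in the tail quantiles.

```r
# Cornish-Fisher adjusted quantile as a function of skewness S and
# excess kurtosis K (standard fourth-order expansion)
cf_quantile <- function(p, S, K) {
  z <- qnorm(p)
  z + (z^2 - 1) * S / 6 + (z^3 - 3 * z) * K / 24 - (2 * z^3 - 5 * z) * S^2 / 36
}
cf_quantile(0.05, S = 0, K = 0) == qnorm(0.05)    # reduces to the normal quantile
cf_quantile(0.05, S = -0.8, K = 1) < qnorm(0.05)  # negative skew fattens the lower tail
```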
In this discussion we treat the terms “Industry” and “Sector” interchangeably, noting that for some models, e.g., a U.S. equity model, one may prefer to just use sector factors but may also wish to use industry factors, and a global model with countries may also contain industry factors. Our current examples use sector factors but we refer to them in our mathematical models loosely as industry factors.
We will illustrate use of fitFfm to fit a market plus sector model to the DJIA stock returns and sector data. But first we fit a pure sector model without a market component and examine the factor return coefficients for the first month of the five-year fitting window as a reference point.
dat = factorDataSetDjia5Yrs
fitSec = fitFfm(dat,
asset.var = "TICKER",
ret.var = "RETURN",
date.var = "DATE",
exposure.vars = "SECTOR")
round(coef(summary(fitSec)$sum.list[[1]])[, 1], 3)
COSTAP ENERGY FINS HEALTH INDUST INFOTK MATRLS TELCOM
-0.007 0.034 -0.121 -0.028 -0.016 0.037 0.082 -0.080
round(fitSec$factor.returns[1, ], 3)
COSTAP ENERGY FINS HEALTH INDUST INFOTK MATRLS TELCOM
2008-02-01 -0.007 0.034 -0.121 -0.028 -0.016 0.037 0.082 -0.08
Note that the last two lines of code produce identical results. This is because, without any constraints such as those discussed above, the coefficients of the cross-section regression at each time period are extracted to form the time series of factor returns in the factor.returns component of the ffm object. Now we fit a market plus sector model by adding the fitFfm argument addIntercept = T, and examine the coefficients ĝ and the resulting factor returns f̂ for the first month of the five-year fitting window.
fitSecInt = fitFfm(dat,
asset.var = "TICKER",
ret.var = "RETURN",
date.var = "DATE",
exposure.vars = "SECTOR",
addIntercept = T)
round(coef(summary(fitSecInt)$sum.list[[1]])[, 1], 2)
g1 g2 g3 g4 g5 g6 g7 g8
-0.01 0.01 0.05 -0.11 -0.02 0.00 0.05 0.09
# exactly the same data as before, but with addIntercept = T we add the market component
round(fitSecInt$factor.returns[1, ], 2)
Market COSTAP ENERGY FINS HEALTH INDUST INFOTK MATRLS TELCOM
2008-02-01 -0.01 0.01 0.05 -0.11 -0.02 0 0.05 0.09 -0.07
round(sum(fitSecInt$factor.returns[1, -1]), 2)
[1] 0
# The sector factor returns should sum to zero, due to the sum-to-zero constraint across sectors
Note that the next-to-last line of code above prints the unique least squares model coefficient vector ĝ for month 1 (8 coefficients: the market plus 7 free sector coefficients, since with 8 sectors one is determined by the sum-to-zero constraint).
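A sketch of how the sum-to-zero restriction works: with k sectors only k − 1 coefficients are free, and a restriction matrix R maps the free coefficient vector g to k sector factor returns f = Rg that sum to zero (here with a hypothetical k = 3 and made-up coefficients).

```r
# Sum-to-zero restriction matrix for k = 3 sectors
k <- 3
R <- rbind(diag(k - 1), -rep(1, k - 1))  # last sector return = -(sum of the others)
g <- c(0.05, -0.11)                      # hypothetical free sector coefficients
f <- drop(R %*% g)                       # the 3 sector factor returns
sum(f)                                   # zero by construction
```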
We have created an artificial example of a market+sector+country model (where sector plays the role of industry) consisting of random returns of 30 stocks, with three sectors for the sector factor and two countries for the country factor, for each of five months. The normally distributed returns for the three sectors alone have means 1, 2, 3, each with standard deviation .2. The two countries contribute additional normally distributed returns with means 4 and 5 and standard deviations .2. So returns associated with the first country have means 5, 6, 7 and returns associated with the second country have means 6, 7, 8, giving an overall mean of 6.5. The code for creating the returns is as follows:
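A one-line check of the stated arithmetic: averaging the six sector-by-country means gives the overall mean.

```r
# Overall mean of the six sector/country return means: 5,6,7 and 6,7,8
mean(c(5, 6, 7, 6, 7, 8))
```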
# Country Incremental Components of Asset Returns
set.seed(10000)
Bind = cbind(rep(1, 30),
c(rep(1, 10), rep(0, 20)),
c(rep(0, 10), rep(1, 10), rep(0, 10)),
c(rep(0, 20), rep(1, 10)))
cty1 = matrix(rep(c(0, 1), 15))
cty2 = matrix(rep(c(1, 0), 15))
Bmic = cbind(Bind, cty1, cty2)
dimnames(Bmic)[[2]] = c("mkt", "sec1", "sec2", "sec3", "cty1", "cty2")
r.add = rnorm(30, 4, 0.2)
r.cty1 = rep(0, 30)
r.cty2 = rep(0, 30)
for (i in 1:30)
{
if (Bmic[i, "cty1"] == 1)
{
r.cty1[i] = r.add[i]
r.cty2[i] = 0
}
else
{
r.cty1[i] = 0
r.cty2[i] = r.add[i] + 1
}
}
# Asset Returns for Market+Industry+Country Model
mu = c(1, 2, 3)
sd = c(0.2, 0.2, 0.2)
r = list()
r.mic = list()
fitMic = list()
fitMic1 = list()
for (i in 1:5)
{
set.seed(1099923 + (i - 1))
r[[i]] = c(rnorm(10, mu[1], sd[1]),
rnorm(10, mu[2], sd[2]),
rnorm(10, mu[3], sd[3]))
r.mic[[i]] = r[[i]] + r.cty1 + r.cty2
}
#qq-plot of the 30 asset returns for the first of the 5 time periods
# The observed values of the two variables whose distributions are to be compared are each ordered by size. These ordered values are combined into pairs and plotted in a coordinate system. If the points fall (approximately) on a straight line, one can conjecture that the two variables follow the same distribution. The procedure is problematic when the two variables have different numbers of observations; interpolation methods can help in that case.
qqnorm(r.mic[[1]],
main = "MIC Model Equity Returns for First Period",
xlab = "NORMAL QQ-PLOT",
ylab = "RETURNS")
Now we build the data frame required by fitFfm, fit the MIC model, and display the factor returns for each of the five months. What we have been calling the industry factor is called Sector in this example.
Returns = unlist(r.mic)
COUNTRY = rep(rep(c("US", "India"), 15), 5)
SECTOR = rep(rep(c("SEC1", "SEC2", "SEC3"), each = 10), 5)
TICKER = rep(c(LETTERS[1:26], paste0("A", LETTERS[1:4])), 5)
DATE = rep(seq(as.Date("2000/1/1"), by = "month", length.out = 5), each = 30)
data.mic = data.frame(DATE = as.character(DATE),
TICKER,
Returns,
SECTOR,
COUNTRY)
exposure.vars = c("SECTOR", "COUNTRY")
fit = fitFfm(data = data.mic,
asset.var = "TICKER",
ret.var = "Returns",
date.var = "DATE",
exposure.vars = exposure.vars,
addIntercept = T)
fit$factor.returns
Market SEC1 SEC2 SEC3 India US
2000-02-01 6.50 -0.964 -0.00539 0.970 -0.501 0.501
2000-03-01 6.53 -0.965 -0.02718 0.993 -0.513 0.513
2000-04-01 6.52 -0.956 -0.00954 0.966 -0.513 0.513
2000-05-01 6.58 -1.008 -0.02796 1.036 -0.516 0.516
We see that the Market factor values cluster around 6.5, as expected. We can also see that the three sector factor returns sum to zero and the two country factor returns sum to zero, as required by the sum-to-zero constraints.
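A quick check on the rounded factor returns printed above for the first month: the sector returns and the country returns each sum to zero, up to rounding of the displayed values.

```r
# Sum-to-zero check on the displayed factor returns for 2000-02-01
sec <- c(-0.964, -0.00539, 0.970)  # SEC1, SEC2, SEC3
cty <- c(-0.501, 0.501)            # India, US
round(sum(sec), 2)
round(sum(cty), 2)
```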